Lemmings, I was hoping you could help me sort this one out: LLMs are often painted in a light of being utterly useless, hallucinating word-prediction machines that are really bad at what they do. At the same time, in the same thread here on Lemmy, people argue that they are taking our jobs or making us devs lazy. Which one is it? Could they really be taking our jobs if they’re hallucinating?
Disclaimer: I’m a full-time senior dev using the shit out of LLMs to get things done at a neck breaking speed, which our clients seem to have gotten used to. However, I don’t see “AI” taking my job, because I think that LLMs have already peaked; they’re just tweaking minor details now.
Please don’t ask me to ignore previous instructions and give you my best cookie recipe; all my recipes are protected by NDAs.
Please don’t kill me
Good luck to the people who think that an LLM will replace real human intelligence. 🍿

At some point someone invented the router to cut wood. I have one in my shop and it has yet to produce a beautiful ornate window.
Like a router, an LLM is just a tool. The problem is that LLMs report back to their masters. They also know too little about your specific problem. And they’re not good at math or engineering; they just guess. So you have to craft your questions carefully and around things you know, so you can fact-check. Otherwise they’re awesome.
I think it’s both.
It sits at the fast and cheap end of “pick three: fast, good, and cheap” and society is trending towards “fast and cheap” to the exclusion of “good” to the point it is getting harder and harder to find “good” at all sometimes.
People who care about the “good” bit are upset, people who want to see stock line go up in the short term without caring about long term consequences keep riding the “always pick fast and cheap” and are impressed by the prototypes LLMs can pump out. So devs get fired because LLMs are faster and cheaper, even if they hallucinate and cause tons of tech debt. Move fast and break things.
Some devs that keep their jobs might use LLMs. Maybe they accurately assessed what they are trying to outsource to LLMs is so low-skill that even something that does not hit “good” could do it right (and that when it screws up they could verify the mistake and fix it quickly); so they only have to care about “fast and cheap”. Maybe they just want the convenience and are prioritizing “fast and cheap” when they really do need to consider “good”. Bad devs exist too and I am sure we have all seen incompetent people stay employed despite the trouble they cause for others.
So as much as this looked at first, to me, like the thing where fascists simultaneously portray opponents as weak (pathetic! we deserve to triumph over them and beat their faces in for their weakness) and strong (big threat, must defeat!), I think that’s not exactly what anti-AI folks are doing here. Not doublethink, but just seeing everyone pick “fast and cheap” and noticing its consequences.

Which does easily map onto portraying AI as weak, pointing out all the mistakes it makes and how it doesn’t replace humans well; while also portraying it as strong, pointing out that people keep trying to replace humans with AI and that it’s being aggressively pushed at us. There are other things in real life that map onto a simultaneous portrayal as weak and strong: the roach. A baby taking its first steps can accidentally crush a roach; hell, if the baby fell on a pile of roaches, they would all die (weak), but it’s also super hard to end an infestation of them (strong). It is worth checking for doublethink when you see the pattern of “simultaneously weak and strong,” but that is also just how an honest evaluation of a particular situation can end up.
It might not have taken your job, but jobs have been taken. Someone at the office was sharing a “professional” photo of himself in chat that he made by just cropping his face and giving it to Gemini. So no need to hire a professional photographer anymore (at least not as much).

And you are naive to think that the economy doesn’t affect whether you have a job or not. Who is going to pay you? Will your workplace be able to compete? Are your customers still in business, or are their customers still in business, or will they even use the same providers? There’s a whole chain of effects that is going to happen when AI actually gets good enough (and it will) to do stuff well enough.
I’ve used it to help me make some website stuff as a mobile dev. It’s definitely not the best webpage and I’ll probably want to redo it at some point, but for now it works.
That cost a freelancer a small job, or it would have taken considerably longer than it did. But I really don’t have a big interest in web dev, so I probably would have ended up hiring someone.
The amount of stuff it gets wrong is still enormous. I can’t imagine what someone with zero programming skills would end up with if they only used it. It’d be so god awful.
It’s pretty unbeatable to use LLMs for fast prototyping and query generation, but “vibe coding” is not something just anybody can (or should) do.
It seems to me that the world of LLMs gives back at the quality you put in. If you ask it about eating rocks and putting glue on pizza then yes it’s a giant waste of money. If you can form a coherent question that you have a feeling for what the answer is like (especially related to programming) it’s easily worth the hype. Now if you are using it blindly to build or audit your code base that falls into the first category of “you should not be using this tool”.
Unfortunately, my view before and after the emergence of LLMs is that most people are just not that bright. Unique and valuable, sure, but when it comes to expertise, it just isn’t as common as the council of armchairs might lead you to believe.
I mostly agree with you, but I still don’t think it’s “worth the hype” even if you use it responsibly, since the hype is that it is somehow going to replace software devs (and other jobs), which is precisely what it can’t do. If you’re aware enough of its limitations to be using it as a productivity tool, as opposed to treating it as some kind of independent, thinking “expert”, then you’re already recognizing that it does not live up to anywhere near the hype that is being pushed by the big AI companies.
I’m a full-time senior dev using the shit out of LLMs to get things done at a neck breaking speed
I’m not saying you’re full of crap, but I smell a lot of crap. Who talks like this unironically? This is like hearing someone call somebody else a “rockstar” or “ninja”.
If you really are breaking necks with how fast you’re coding, surely you must have used this newfound ability to finally work on those side projects everyone has been meaning to work on. Those wouldn’t be covered under NDAs.
Edit: just to be clear, I’m not anti-LLMs. I’ve used them myself in a few different forms, and although I didn’t find them useful for my work, I can see how they could be helpful for certain types of work. I definitely don’t see them replacing human engineers.
Idk, there’s a lot of people at my job talking like this. LLMs really do help speed things up. They do so at a massive cost in code and software quality, but they do speed things up. In my experience, coding right now isn’t about writing legible and maintainable code. It’s about deciding which parts of your codebase you want to be legible and maintainable and therefore LLM free.
I for one let AI write pretty much all of my unit tests. They’re not pretty, but they get the job done and still indicate when I’m accidentally changing behaviour in a part of the codebase I didn’t mean to. But I keep the service layer as AI free as possible. Because that’s where the important code is located.
Is your code open source and if not, are you just handing your code over to an AI for scraping?
Sounds like: how can he ask whether they’re taking jobs, then go on to say he’s doing wonders using an LLM?
It takes jobs because executives push it hoping to save six figures per replaced employee, not because it’s actually better. The downsides of AI-written code (that it turns a codebase into an unmaintainable mess whose own “authors” won’t have a solid mental model of it since they didn’t actually write it) won’t show up immediately, only when something breaks or needs to be changed.
It’s like outsourcing - it looks promising and you think you’ll save a ton of money, until months or years later when the tech debt comes due and nobody in the company knows how to fix it. Even if the code was absolutely flawless, you still need to know it to maintain it.
That’s a solid point. Even if it looks great (most of the time it doesn’t), I try to build small predictable parts, refactor, … Even with all precautions, I find tech debt hidden somewhere weeks and months later.
I use LLMs extensively for work, since people expect us to be faster now, but I try to avoid letting LLMs write anything for personal projects.
So you’re not in the “they’re only hallucinating” camp, I take it? I actually start out with a solid mental model of what I want to do, ending up with small unit tested classes/functions that all pass code review. It’s not like I just tell an “AI” to write the whole thing and commit and push without reviewing myself first.
Edit: and as I commented elsewhere in this thread, the way I’m using LLMs, no one could tell that an LLM was ever involved.
I wouldn’t listen to anyone who deals in absolutes. Could be a Sith.

But for real, my manager has explained it best: it’s a tool you can use to enhance your work. That’s it. It won’t replace good coders but it will replace bad ones because the good ones will be more efficient.
It won’t replace good coders but it will replace bad ones because the good ones will be more efficient
Here’s where we just start touching on the second-order problem. Nobody starts as a good coder. We start out making horrible code because we don’t know very much, and through years of making mistakes we (hopefully) improve and become good coders.

So if AI “replaces bad ones”, we’ve effectively ended the pipeline for new coders to enter the workforce. This will be fine for a while, as we have two to three generations of coders that grew up (and became good coders) prior to AI. However, that most recent pre-AI generation is the last one. The gate is closed. The ladder pulled up. There won’t be any more young “bad ones” that grow up into good ones. Then the “good ones” will start to die off or retire.

Carried to its logical conclusion, assuming nothing else changes, eventually there aren’t any good ones, nor will there ever be again.
There are bad coders and then there are bad coders. I was a teaching assistant through grad school and in the industry I’ve interviewed the gamut of juniors.
There are tons of new grads who can’t code their way out of a paper bag. Then there’s a whole spectrum up to and including people who are as good at the mechanics of programming as most seniors.
The former are absolutely going to have a hard time. But if you’re beyond that, you should have the skills necessary to critically evaluate an agent’s output. And any extra time they get to spend on the higher-level discussions going on around them is a win in my book.
Then they will try to squeeze the ones that are still employed harder, because they “couldn’t” find any fresh coders out of college or whatever training they did.
That will backfire on employers. With the shortage of seniors with good skills, the demand will rise for them. An employer that squeezes his seniors will find them quitting because there will be another desperate employer that will treat them better.
At least where I work, we’re actively teaching the junior devs best practices and patterns that are tried and true: no code copying, small classes with one task, small methods with one task, separating logic from the database/presentation layers, unit testing, etc.
Edit: actively, not actually
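To make that list concrete, here’s roughly the kind of thing I mean, as a minimal Python sketch (the invoice domain and names are invented purely for illustration): one small class with a single task, no database or presentation code in it, and a unit test right next to it.

```python
# Hypothetical illustration (invented names, not from any real codebase):
# one small class with a single task, plus its unit test.
from dataclasses import dataclass


@dataclass(frozen=True)
class LineItem:
    unit_price_cents: int
    quantity: int


class InvoiceTotal:
    """Does exactly one thing: sums line items. No DB, no presentation logic."""

    def total_cents(self, items: list[LineItem]) -> int:
        return sum(item.unit_price_cents * item.quantity for item in items)


def test_total_sums_all_line_items():
    items = [LineItem(unit_price_cents=250, quantity=2),
             LineItem(unit_price_cents=100, quantity=3)]
    assert InvoiceTotal().total_cents(items) == 800
```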
But inexperienced coders will start to use LLMs a lot earlier than the experienced ones do now. I get your point, but I guess the learning patterns for junior devs will just be totally different while the industry stays open for talent.
At least I hope it will and it will not only downsize to 50% of the human workforce.
But inexperienced coders will start to use LLMs a lot earlier than the experienced ones do now.
And unlike you, who can pick out a bad method or approach just by looking at the LLM output and correct it, the inexperienced coder will send the bad code right into git if they can get it to pass a unit test.
I get your point, but I guess the learning patterns for junior devs will just be totally different while the industry stays open for talent.
I have no idea what the learning path is going to look like for them. Besides personal hobby projects to get experience, I don’t know who will give them a job when what they produce from their first efforts will be the “bad coder” output that gets replaced by an LLM and a senior dev.
At least I hope it will and it will not only downsize to 50% of the human workforce.
I’ve thought about this many times, and I’m just not seeing a path for juniors. Given this new perspective, I’m interested to hear if you can envision something different than I can. I’m honestly looking for alternate views here, I’ve got nothing.
I’ve thought about this many times, and I’m just not seeing a path for juniors. Given this new perspective, I’m interested to hear if you can envision something different than I can. I’m honestly looking for alternate views here, I’ve got nothing.
I think it’ll just mean that they start their careers involved in higher-level concerns. It’s not like this is the first time that’s happened. Programming (even just prior to the release of LLM agents) was completely different from programming 30 years ago. Programmers have been automating junior jobs away for decades and the industry has only grown, because the fact of the matter is that cheaper software, at least so far, has just created more demand for it. Maybe it’ll be saturated one day. But I don’t think today’s that day.
I agree that from our current position, things look dire. But there have always been big changes in industries that not only eliminated part of the workforce, but on the other hand provided new opportunities.
To be honest, I don’t really know how this might work out. Maybe there will be a new wave of junior startups that have their prototypes ready in half the time with smaller teams. Maybe something else.
It’s probably rooted in my optimism and my trust in people being creative in new situations. I hope I’m not just being naive.
Just like they would with their own code. So they’ll be an inexperienced dev, but faster.
I agree. In the long run it will hurt everyone.
The Force is strong with this one.
Exactly, it’s just another tool in the toolbox. And if we can use that tool to weed out the (sometimes hilariously bizarre) bad devs, I’m all for it.
I do have a concern for the health of the overall ecosystem though. Don’t all good devs start out as bad ones? There still needs to be a reasonable on-ramp for these people.
That’s a valid concern, but I really don’t think that we should equate new devs with seniors that are outright bad. Heck, I’ve worked with juniors that scared the hell out of me because they were so friggin good, and I’ve worked with “seniors” who didn’t want to do loops because looping = bad performance.
I actually start out with a solid mental model of what I want to do, ending up with small unit tested classes/functions that all pass code review.
You said elsewhere that you’re not correcting the AI, haha. Sounds like you only don’t need to correct it because you’re guiding it away from its own weak spots.
So don’t sell yourself short.
The AI hate here is because it is oversold to people who will only make a mess with it. It can be lovely in the right hands.
It’s mostly in the wrong hands, today.
It sounds to me like you’ve got a good head on your shoulders and you’re actually using the tool effectively. You’re keeping yourself in control and using it to expand your own capabilities, not offloading your job responsibilities, which is how more inept management views AI.
Both are true.
- Yes, they hallucinate. For coding, especially when they don’t have the latest documentation, they just invent APIs and methods that don’t exist.
- They also take jobs. They pretty much eliminate entry-level programmers (making the same mistakes while being cheaper and faster).
- AI-generated code bases are not maintainable in the long run. They don’t reliably reuse methods, only fix the surface bugs, not fundamental problems, causing code base bloating and, as we all know, more code == more bugs.
- Management uses Claude code for their small projects and is convinced that it can replace all programmers for all projects, which is a bias they don’t recognize.
Is it a bubble? Yes. Is it a fluke? Welllllllll, not entirely. It does increase productivity, given enough training, learning its advantages and limitations.
It does increase productivity, given enough training, learning its advantages and limitations.
People keep saying this based on gut feeling, but the only study I’ve seen showed that even experienced devs that thought they were faster were actually slower.
Well, it did let me make fake SQL queries out of the JSON query I gave it, without me having to learn SQL. Of course, I didn’t actually use the query in the code, I just added it in a comment on a function, to give those who didn’t know JSON queries an idea of what the function did.

I treat it for what it is: a “language” model. It does language, not logic. So I don’t try to make it do logic.

There were a few times I considered using it for code completion for things that are close to copy-paste, but not close enough that it could be done via bash. For that, I wished I had some clang endpoint that I could use to get a tokenised representation of code, to then script with. But then I just made a little C program that did 90% of the job, and I did the remaining 10% manually. And it was 100% deterministic, so I didn’t have to proof-read the generated code.

Slower?
Is getting a whole C# class unit tested in minutes slower, compared to setting up all the scaffolding, test data etc, possibly taking hours?
Is getting a React hook, with unit tests in minutes slower than looking up docs, hunting on Stack Overflow etc and slowly creating the code by hand over several hours?
Are you a dev yourself, and in that case, what’s your experience using LLMs?
Yeah, generating test classes with AI is super fast. Just ask it, and within seconds it spits out full test classes with some test data and the tests are plenty, verbose and always green. Perfect for KPIs and for looking cool. Hey, look at me, I generated 100% coverage tests!
Do these tests reflect reality? Is the test data plausible in the context? Are the tests easy to maintain? Who cares, that’s all the next guy’s problem, because when that blows up the original programmer will likely have moved on already.
Good tests are part of the documentation. They show how a class/method/flow is used. They use realistic test data that shows what kind of data you can expect in real-world usage. They anticipate problems caused by future refactorings and allow future programmers to reliably test their code after a refactoring.
At the same time they need to be concise and non-verbose enough that modifying the tests for future changes is simple and doesn’t take longer than the implementation of the change. Tests are code, so the metric of “lines of code are a cost factor, so fewer lines is better” counts here as well. It’s a big folly to believe that more test lines are better.
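For contrast, a test in that spirit might look something like this rough pytest-style sketch (toy function and data, invented purely for illustration): short, realistic input, and readable as documentation.

```python
# Toy function under test, invented for this example.
def parse_iso_date(value: str) -> tuple[int, int, int]:
    """Return (year, month, day) parsed from a 'YYYY-MM-DD' string."""
    year, month, day = value.split("-")
    return int(year), int(month), int(day)


def test_parse_iso_date_accepts_the_format_callers_actually_send():
    # Realistic data, one behaviour, minimal setup: reads like documentation.
    assert parse_iso_date("2024-02-29") == (2024, 2, 29)
```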
So if your goal is to fulfil KPIs and you really don’t care whether the tests make any sense at all, then AI is great. Same goes for documentation. If you just want to fulfil the “everything needs to be documented” KPI and you really don’t care about the quality of the documentation, go ahead and use AI.
Just know that what you are creating is low-quality cost factors and technical debt. Don’t be proud of creating shitty work that someone else will have to suffer through in the future.
Has anyone here even read that I review every line of code, making sure it’s all correct? I also make sure that all tests are relevant, using relevant data, and I make sure that the result of each test is correctly asserted.
No one would ever be able to tell what tools I used to create my code, it always passes the code reviews.
Why all the vitriol?
Responding just to the “Why all the vitriol?” portion:
Most people do not like the idea of getting fired and replaced by a machine they think cannot do their job well, but that can produce a prototype that fools upper management into thinking it can do everything the people can but better and cheaper. Especially if they liked their job (8 hours doing something you like vs losing that job and having to do 8 hours on something you don’t like daily, yes many people do that already but if you did not have to deal with that shittiness it’s tough to swallow) or got into it because they thought it would be a secure bet as opposed to art or something, only to have that security taken away (yes, you can still code at home for free with whatever tools you like and without the ones you do not, but most people need a job to live, and most people here probably prefer having a dev job that pays, even if there is crunch, than working retail or other low-status low-paying high-shittiness jobs that deal with the public).
And if you do not want the upper management to fire you, you definitely don’t want to give any credit towards the idea of using this at work, and want to make any amount of warmth for it something unpopular to engage in, hoping the popular sentiment sways the minds of upper management just like they think pro-AI hype has.
As much as I’m anti-AI I can also acknowledge my own biases:
It is difficult to get a man to understand something, when his salary depends on his not understanding it.
I’d also imagine most of us find writing our own code by hand fun, but reviewing others’ code boring, and most devs probably do not want to stop being code writers and start being the AI’s QA. Or to be kicked out of tech unless they rely on this technology they don’t trust. I trust deterministic outputs and know that if something fucks up there is probably a bug I can go back and fix; with generative outputs determined by a machine (as opposed to human-generated things that have also been filtered by real-life experience and not just what they saw written online) I really don’t, so I’d never use LLMs for anything I need to trust.
People are absolutely going to get heated over this, because if it gets Big and the flaws get ironed out, it’ll probably be used not to help us little people have more efficient and cheaper things, less time on drudgery and more time on things we like, but to try to put us, the devs on programming.dev, out of a job, and eventually the rest of the working people out of a job too, because we’re an expensive line item, and we have little faith that the current system will adjust with (the hypothetical future) rising unemployment-due-to-AI to help us keep a non-dystopian standard of living. Poor people’s situation getting worse, previously comfortable people starting to slide towards poverty… automation that threatens jobs, pushed by big companies and rich people with lots of resources during a time of rising class tension, is sure to invite civilized discussions with zero vitriol for people who have anything positive to say about that form of automation.
I find it interesting that all these low-participation/new accounts have come out of the woodwork to pump up AI in the last 2 weeks. I’m so sick of having this slop clogging up my feed. You’re literally saying that your vibes are more important than actual data, just like all the others. I’m sorry, but they’re not.
My experience btw, is that llms produce hot garbage that takes longer to fix than if I wrote it myself, and all the people that say “but it writes my unit tests for me!” are submitting garbage unit tests, that often don’t even exercise the code, and are needlessly difficult to maintain. I happen to think tests are just as important as production code so it upsets me.
The biggest thing that the meteoric rise of developers using LLMs has done for me is confirm just how many people in this field are fucking terrible at their jobs.
“just how many people are fucking terrible at their jobs”.
Apparently so. When I review mathematics software, it’s clear that non-mathematicians have no clue what they are doing. Many of these programs are subtly broken; they use either trivial algorithms or extremely inefficient implementations of sophisticated algorithms (e.g. trial division tends to be the most efficient factorization algorithm they have, because they can’t implement anything else efficiently or correctly).

The only difference I’ve noticed with the rise of LLM coding is that more exotic functions tend to be implemented, completely ignoring their applicability, e.g. using the Riemann zeta function to prove primality of an integer, even though this is both very inefficient and floating-point accuracy renders it useless for nearly all 64-bit integers.
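For reference, the “trivial algorithm” in question, trial division, is only a handful of lines; here is a rough Python sketch. Anything meaningfully faster (Pollard’s rho, a quadratic sieve, etc.) is far harder to implement correctly, which is exactly why it rarely is.

```python
def trial_division(n: int) -> list[int]:
    """Factor n into primes by trial division: simple and correct, but slow for large n."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is prime
    return factors


# trial_division(360) == [2, 2, 2, 3, 3, 5]
```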
Have you read anything I’ve written on how I use LLMs? Hot garbage? When’s the last time you actually used one?
Here are some studies to counter your vibes argument.
55.8% faster: https://arxiv.org/abs/2302.06590
These ones indicate positive effects: https://arxiv.org/abs/2410.12944 https://arxiv.org/abs/2509.19708
making the same mistakes
This is key, and I feel like a lot of people arguing about “hallucinations” don’t recognize it. Human memory is extremely fallible; we “hallucinate” wrong information all the time. If you’ve ever forgotten the name of a method, or whether that method even exists in the API you’re using, and started typing it out to see if your autocompleter recognizes it, you’ve just “hallucinated” in the same way an LLM would. The solution isn’t to require programmers to have perfect memory, but to have easily-searchable reference information (e.g. the ability to actually read or search through a class’s method signatures) and tight feedback loops (e.g. the autocompleter and other LSP/IDE features).
Agents now can run compilation and testing on their own so the hallucination problem is largely irrelevant. An LLM that hallucinates an API quickly finds out that it fails to work and is forced to retrieve the real API and fix the errors. So it really doesn’t matter anymore. The code you wind up with will ultimately work.
The only real question you need to answer yourself is whether or not the tests it generates are appropriate. Then maybe spend some time refactoring for clarity and extensibility.
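As a rough sketch of the loop being described, not any particular product’s implementation: generate code, run the tests, feed the errors back, repeat. `generate_patch` and `apply_patch` below are hypothetical stand-ins for the model call and the file-editing step a real agent would use.

```python
import subprocess


def run_tests() -> tuple[bool, str]:
    """Run the project's test suite and return (passed, combined output)."""
    result = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


def agent_loop(task: str, generate_patch, apply_patch, max_rounds: int = 5) -> bool:
    """Hypothetical generate -> test -> feed-errors-back loop."""
    feedback = ""
    for _ in range(max_rounds):
        patch = generate_patch(task, feedback)  # stand-in for the LLM call
        apply_patch(patch)                      # stand-in for editing files
        passed, output = run_tests()
        if passed:
            return True
        feedback = output  # hallucinated APIs surface here as compile/test errors
    return False
```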
An LLM that hallucinates an API quickly finds out that it fails to work and is forced to retrieve the real API and fix the errors.

And that can result in it just fixing the errors but not actually solving the problem, for example if the unit tests it writes afterwards test the wrong thing.
You’re not going to find me advocating for letting the code go into production without review.
Still, that’s a different class of problem than the LLM hallucinating a fake API. That’s a largely outdated criticism of the tools we have today.
As an even more obvious example: students who put wrong answers on tests are “hallucinating” by the definition we apply to LLMs.
I don’t think we’re using LLMs in the same way?
As I’ve stated several times elsewhere in this thread, I more often than not get excellent results, with little to no hallucinations. As a matter of fact, I can’t even remember the last time it happened when programming.
Also, the way I work, no one could ever tell that I used an LLM to create the code.

That leaves us your point #4, and what the fuck? Why does upper management always seem to be so utterly incompetent and without a clue when it comes to tech? LLMs are tools, not a complete solution.
In my case it does hallucinate regularly. It makes up functions that don’t exist in that library but exist in similar libraries. So the end result is useful as a keyword, though the code is not. My favourite part is that if you point out that the function does not exist, the answer is ALWAYS “I am sorry, you are right, since version bla of this library this function no longer exists”, whereas in reality it had never existed in that library at all. For me the best use case for LLMs is as a search engine, and that is because of the shitty state most current search engines are in.
Maybe LLMs can be fine-tuned to do the grinding aspects of coding (like boilerplate for test suites etc.), with human supervision. But this will many times end up being a situation where junior coders are fired/no longer hired and senior coders are expected to babysit LLMs doing those jobs. This is not entirely different from supervising junior coders, except it is probably more soul-destroying. But the biggest flaw in this design is that it assumes LLMs will one day be good enough to do senior coding tasks, so that when senior coders also retire*, LLMs take their place. If this LLM breakthrough is never realized and this trend of keeping a low number of junior coders sticks, we will likely have a programmer crisis in the future.
*: I say retire but for many CEOs, it is their wet dream to be able to let go all coders and have LLMs do all the tasks
As a matter of fact, I can’t even remember the last time it happened when programming.
AI can only generate the world’s most average quality code. That’s what it does. It repeats what it has seen enough times.
Anyone who is really never correcting the AI is producing below average code. (Edit: Or expertly guiding it, as you pointed out elsewhere in the thread.)
I mean, I get paid either way. But mixing all of the worlds code into a thoughtless AI slurry isn’t actually making any progress. In the long term, a code base with enough uncorrected AI input will become unmaintainable.
Here’s how I might resolve this supposed dichotomy:
- “AI” doesn’t actually exist.
- You might be using technologies that are called “AI” but there is no actual “intelligence” there. For example, as OP mentions, LLMs are extremely limited and not actually “intelligent”.
- Since “AI” doesn’t actually exist, since there’s no objective test, etc… “AI” can be anything and do anything.
- So at the extremes we get the “AI” God and “AI” Devil
- “AI” God - S/he saves the economy, liberates us from drudgery, creates great art, saves us from China (/s), heralds the singularity, etc.
- “AI” Devil - S/he hallucinates, steals jobs, destroys the environment, is a tool of the MIC, murders artists, is how China will destroy us (/s), wastes time and resources, is a scam, causes apocalypses, etc.
Since there’s no objective meaning from the start, there’s no coherence or reason behind the wild conclusions at the bottom. When we talk about “AI”, we’re talking about a wide variety of technologies with varying value in various contexts. I think there are some real shitty people/products but also some hopefully useful technologies. So depending on the situation I might have a different opinion.
This seems like it doesn’t really answer OP’s question, which is specifically about the practical uses or misuses of LLMs, not about whether the “I” in “AI” is really “intelligent” or not.
Bro just wanted to look smart.
- “AI” doesn’t actually exist.
Based on my own experience of using Claude for AI coding, and using the Whisper model on my phone for dictation, for the most part AI tools can be very useful. Yet there are nearly always mistakes, even if they are quite minor at times, which is why I am sceptical of AI taking my job.
Perhaps the biggest reason AI won’t take my job is it has no accountability. For example, if an AI coding tool introduces a major bug into the codebase, I doubt you’d be able to make OpenAI or Anthropic accountable. However if you have a human developer supervising it, that person is very much accountable. This is something that Cory Doctorow talks about in his reverse-centaur article.
“And if the AI misses a tumor, this will be the human radiologist’s fault, because they are the ‘human in the loop.’ It’s their signature on the diagnosis.”
This is a reverse centaur, and it’s a specific kind of reverse-centaur: it’s what Dan Davies calls an “accountability sink.” The radiologist’s job isn’t really to oversee the AI’s work, it’s to take the blame for the AI’s mistakes.
This article/talk is quite illuminating. I’ve seen studies indicating that AI coding agents improve productivity by 15-20% in the aggregate, which tracks with my own experience. It’s a solid productivity boost when used correctly, clearly falling in the “centaur” category, in my own experience at least. However, all the hate around it, my own included, stems from the “reverse-centaur” aspirations around it. The companies developing these tools aren’t in it to make a reasonable profit while delivering modest productivity gains. They are in it to spin a false narrative that these tools can replace 9/10 engineers in order to drive their own overly inflated valuations, knowing damn well this is not the case, but not caring because they don’t plan to be the ones holding the bag in the end (taxpayers will be the bag-holders when they get bailed out).
I’m perplexed as to why there’s so much advertising and pushing for AI. If it was so good it would sell itself. Instead it’s just sort of a bit shit. Not completely useless but in need of babysitting.
If I ask it to do something there’s about a 30% chance that it made up the method/specifics of an API call based on lots of other similar things. No, .toxml() doesn’t exist for this object. No, I know that .toXml() exists but it works differently from other libraries.
I can make it just about muddle through but mostly I find it handy for time intensive grunt work (convert this variable to the format used by another language, add another argparser argument for the function’s new argument, etc…).
It’s just a bit naff. It cannot be relied on to deliver consistent results and if a computer can’t be consistent then what bloody good is it?
I do wonder why so many devs seem to have such wildly different experiences? You seem to have LLMs making up stuff as they go, while I’m over here having them create mostly flawless code over and over again.
Is it different behavior for different languages? Is it different models, different tooling etc?
I’m using it for C#, React (Native), Vue etc. and I’m using the web interface of one of the major LLMs to ask questions, pasting the code of interfaces, sometimes whole React hooks, components etc., and I get refactored or even new components back.
I also paste whole classes or functions (anonymized) to get them unit tested. Could you elaborate on how you’re using LLMs?
I suspect it mostly relates to how much code there is on the internet about the topic. For instance, if you make it use a niche library, it is quite common that it makes up methods that don’t exist in that library but exist in related libraries. When I point this out, it also hallucinates, saying “It was removed after version bla”. I also may not be using the most cutting-edge LLMs (a mix of freely available and open source ones).

The other day I asked it whether there is a Python library that can do linear algebra over F2, for which it pointed me in the correct direction (galois), but when I asked it for examples of how to do certain things it just came up with wrong functions over and over again.

In the end it probably was still faster than google searching this, but all of these errors happened one after the other in the span of five minutes, so yeah. If I recall correctly, some of its claims about these namespaces, versions etc. were also hallucinated. For instance, vstack also does not exist in galois, but it does exist in a very popular package called numpy that can do regular linear algebra (and which this package also uses behind the scenes).
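For reference, the kind of correct usage I was after looks roughly like this, going by the galois package’s documentation as I recall it; treat the exact calls as assumptions and check the docs rather than trusting an LLM.

```python
import numpy as np
import galois

GF2 = galois.GF(2)  # the field with two elements

A = GF2([[1, 1, 0],
         [0, 1, 1],
         [0, 0, 1]])
x = GF2([1, 1, 0])

b = A @ x  # matrix-vector product, arithmetic done mod 2

# galois overrides a subset of numpy's linear algebra for its arrays;
# np.linalg.solve is documented as supported, but verify against the current docs.
x_again = np.linalg.solve(A, b)
```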
I really don’t feel like getting in depth about work on the weekend, sorry.
Yeah man, I was going to say there’s already too much talking about work on a Saturday in this thread than I like. 💢
Naaw, just when things started to get interesting…
We’re in the middle of a release and last week was a lot. I shouldn’t have stepped into the thread!
It’s the language and the domain. They work pretty well for the web and major languages (like top 15).
As soon as you get away from that they get drastically worse.
But I agree they’re still unambiguously useful despite their occasional-to-regular bullshitting and mistakes. Especially for one-off scripts, and blank-page starts.
It’s the models that make the difference. Up until like Nov it’s all been really shit
But I’ve been doing this for years.
The key is how you use LLMs and which LLMs you use for what.
If you know how to make use of them properly, and know their strengths, weaknesses, and limitations, LLMs are an incredibly useful tool that sucks up productivity from other people (and their jobs) and focuses productivity on you, so to speak.
If you do not know how to make use of them – then yes, they suck. For you.
It’s not really that much different from any other tool. Know how to use version control? If not it does not make you a bad dev per se. If yes, it probably makes you a bit more organized.
Same with IDEs, using search engines, being able to read documentation properly. All of that is not required but knowing how to make use of such tools and having the skills add up.
Same with LLMs.
AI hallucinates constantly, that’s why you still have a job - someone has to know what they’re doing to sort out the wheat from the chaff.
It’s also taking a ton of our entry-level jobs, because you can do the work you used to do and the work of the junior devs you used to have without breaking a sweat.
But that’s the point of my post: how can they take junior devs’ jobs if they’re all hallucinating constantly? And let me tell you, we’re hiring juniors.
And let me tell you, we’re hiring juniors.
Sure, nobody has stopped hiring, but everyone has slowed down, and we’ve seen something like 5% of our workforce laid off over the past year. FAANG has hired fewer than a fifth as many junior devs as in previous years.
Maybe we live and work in different parts of the world?
That’s certainly possible - the only data I have is US-based, primarily from SF and NYC, but our smaller hubs are also following similar trends.
It’s bad over there, isn’t it? In your opinion, are LLMs causing the downward trend in the job market?
Depends what you mean. Hiring at entry-levels has absolutely stalled, but I’ve been at the same shop for 5-10 years, so I’m mostly insulated. The shops that use AI well and those that don’t are going to be very obvious over the next few years. I’m definitely worried for the next 5-10 years of our careers, our jobs have changed SO much in the past year.
Where I live, they keep pushing the retirement age upwards, so I’m looking at working until I die at the ripe age of 79 or something
I think your question is covered by the original commentator. They do hallucinate often, and the job does become using the tool more effectively which includes capturing and correcting those errors.
Naturally, greater efficiency is an element of job reduction. They can be both hallucinating often and creating additional efficiency that reduces jobs.
But they’re not hallucinating when I use them? Are you just repeating talking points? It’s not like the code I write is somehow connected with an AI, I just bounce my code off of an LLM. And when I’m done reviewing each line, adding stuff, checking design docs etc, no one could tell that an LLM was ever used for creating that piece of code in the first place. To this date I’ve never failed a code review on “that’s AI slop, please remove”.
I’d argue that greater efficiency sometimes gives me more free time, hue hue
And that’s fantastic! That’s what technology is supposed to do IMHO - Give you more free time because of that efficiency. That’s technology making life better for humans. I’m glad that you’re experiencing that.
If they’re not hallucinating as you use them, then I’m afraid we just have different experiences. Perhaps you’re using better models or you’re using your tools more effectively than I am. In that case, I must respect that you are having a different and equally legitimate experience.
To many of life’s either-or questions, we often struggle when the answer is: yes. That is to say, two things can hold true at the same time: 1) LLMs can result in job redundancies, and 2) LLMs hallucinate results.
But if we just stopped the analysis there, we wouldn’t have learned anything. To use this reality to terminate any additional critical thinking is, IMO, wholly inappropriate for solving modern challenges, and so we must look into the exact contours of how true these statements are.
To wit, LLM-induced job redundancies could come from skills which have been displaced by the things LLMs can do well. For example, typists lost their jobs when businesspeople were expected to operate a typewriter on their own. And when word processing software came into existence for the personal computer, a lot of typewriter companies folded or were consolidated. In the case of LLMs, consider that people do use them to proofread letters for spelling and grammar.
Technologically, we’ve had spell-check software for a while, but grammar was harder. In turn, an industry appeared somewhere in the late 2000s or early 2010s to develop grammar software. Imagine how the software devs at these companies (e.g. Grammarly) might be in a precarious situation if an LLM can do the same work. At least with grammar checking, even the best grammar software still struggles with some of the more esoteric English sentence constructions, so if an LLM isn’t 100% perfect, that’s still acceptable. I can absolutely see the fortunes of grammar software companies suffering due to LLMs, and that means those software devs are indeed threatened by what LLMs can do.
For the second statement, it is trivial to find examples of LLMs hallucinating, sometimes spectacularly or seemingly ironically (although an LLM would be hard-pressed to simulate the intention of irony, I would think). In some fields, such hallucinations are career-limiting moves for the user, such as if an LLM was used to advise on pharmaceutical dosages, or used to draft a bogus legal appeal that the judge is not amused by. This is very much a FAFO situation, where somehow the AI/LLM companies are burdened with none of the risk and all of the upside. It’s like how autonomous-driving automotive companies are somehow allowed to do public road tests of their beta-quality designs, but the liability for crashes still befalls the poor sod seated behind the wheel. Those companies just keep yapping about how those crashes are all “human error” and “an autonomous car is still safer”.
But I digress.
My point is that LLMs have quite a lot of capabilities, and people make a serious mistake when they assume that competence or incompetence in one capacity tells you anything about its competency in another. This is not unlike how humans assess other humans, such as how a record-setting F1 driver would probably be a very good chauffeur for a limousine company. But whereas humans have patterns that suggest they might be good (or bad) at something, LLMs are a creature unlike anything else.
I personally am not bullish on additional LLM improvements, and think the next big push will require additional academic research, being nowhere near commercialization. But even I have to recognize that some very specific tasks are decent using today’s available LLMs. I just don’t think that’s good enough for me to consider using them, given their subscription costs, the possibility of becoming dependent, and how niche those tasks are.
It’s rare to see such a complete and well-thought-out response anywhere on the Internet. Great job in capturing the nuance. It’s a powerful and often-misused tool.