zthrowaway's comments

Can definitely attest to this. The frequency of outages at my company has increased drastically over the past year, especially since we incorporated agentic development. I’m seeing all of the dev best practices go out the window. We have a few vibe coders who are posting 15-30 PRs per day. It’s way too much for us to review. We’re not a big shop. I think we’re going to have to hire more people just to review code across the industry. And those people will have to know how to actually write software, otherwise what are they even reviewing? Maybe the models will get so good they never make a mistake. Doubt it.

I wonder if the PR workflow is just unsustainable in the agentic era. Rather than review every new feature or bug fix, we would depend on good test coverage, and hold developers accountable for what they ship.

The result might be more faulty code getting merged, but if you already have outages and can't review every PR, is there currently a meaningful benefit to the PR workflow?


This is the "if you're already letting faults through, why not give up trying to stop faults?" approach.

The alternative might be "what if we could get the genie back into the bottle?"

We know some people are using LLMs to evaluate PRs, the only question is who, and how strong the incentive is for them to give up.


Diogenes carrying a lamp, looking for good test coverage

Copy-pasting screenshots of red lines.

> I wonder if the PR workflow is just unsustainable in the agentic era. Rather than review every new feature or bug fix, we would depend on good test coverage, and hold developers accountable for what they ship.

I think what you're describing is setting up the human as the fall guy for the machine.


So taking responsibility for the code you generate is being a "fall guy?"

> So taking responsibility for the code you generate is being a "fall guy?"

Yes, if your boss expects you to use AI agents to generate code faster than you can reasonably understand and review it. You're stuck between a rock and a hard place: you're "responsible," but if you take the time to actually be responsible you'll be reprimanded. The environment pushes you to slack on reviews in the short term to keep your head above water, but when a problem happens because of that you'll be blamed for it.


This reminds me a bit of monoliths vs microservices. People would see microservices as the next new shiny thing and bring it with them to their next job, or read a blog post that sounds great in theory but falls apart in practice. People would see it as a purely architectural decision. But the reality was that you had to have the organizational structure to support that development model or you'd find out that it just doesn't scale the way you expect and introduces its own set of problems. My experience is that most teams that didn't have large orgs got bogged down by the weight of microservices (or things called "microservices"). It required a lot of tooling and orchestration to manage. But there was this promise that you could easily just rewrite that microservice from scratch or change languages and nobody would notice or care.

LLM-generated code feels the same. Reviewing LLM-generated code when it's in the context of a monolith is more taxing than reviewing it in the context of a microservice; the blast radius is larger and the risk is greater, whereas with microservices you can make decisions around how important that service actually is for system-wide stability. You can effectively not care about some services, and can go back and iterate or rewrite them several times over. But more importantly, the organizational structures that are needed to support microservice-like architectures effectively also feel like the organizational structures that are needed to support LLM-generated codebases effectively: more silo-ing, more ownership, more contract- and spec-based communication between teams, etc. Teams might become one person and an agent in that org structure. But communication and responsibilities feel like they require something similar to what is needed to support microservices...just that services are probably closer in size to what many companies end up building when they try to build microservices.

And then there are majestic monoliths, very well curated monoliths that feel like a monorepo of services with clear design and architecture. If they've been well managed, these are also likely to work well for agents, but still suffer the same cognitive overhead when reviewing their work because organizationally people working on or reviewing code for these projects are often still responsible for more than just a narrow slice, with a lot of overlap with other devs, requiring more eyes and buy-in for each change as a result.

The organizational structures that we have in place today might be forced to adapt over time, to silo in ways that narrow ownership and responsibility to fit within what we can juggle mentally. Or they'll be forced to slow down and accept the limitations of the organizational structure. Personal projects have been the area where people have had a lot of success with LLMs, which feels closer to smaller siloed teams. Open-source collaboration with LLM PRs feels like it falls apart for the same cognitive overhead reasons as existing team structures that adopt AI.


The proposed industry solution is to use agents to review PRs, as not to slow down the velocity of delivery...

My current workplace is going through a major "realignment" exercise to replace as many testers with agents as humanly possible, which has proved to be a challenge because the existing process is not well documented.


The fact that anyone in leadership would ever think this is even remotely possible - given my experience in the general state of requirements / contracts / integrations / support - makes me bleed from my earholes just a little bit.

It's starting to just feel a little like an excuse to call everyone on deck for "a few weeks trying 9-9-6". But even then the lack of traction isn't between the eyeballs and the deployment. You'll still be spinning wheels in that slippery stuff between what a customer is thinking and what the iron they bought is doing.


So you essentially trust the output of the model from beginning to end? Curious to know what type of application you're building where you can safely do that.

Edit: to clarify, I know these models have gotten significantly better. The output is pretty incredible sometimes, but trusting it end to end like that just seems super risky still.


I guarantee you it's nothing quantifiable.

LLMs can't be responsible for deciding what code you use because they have no skin in the game. They don't even have skin.

If you type fast, well then it takes just as long to code it yourself as review it. Plus you actually get flow time when you're coding.

For heaven's sake people have the robot write your unit tests and dashboards, not your production code. Otherwise delete yourself.


"Hey Claude, did Claude do a good job?"

I did an experiment today, where I had a new Claude agent review the work of a former Claude agent - both Opus 4.6 - on a large refactor on a 16k LOC project. I had it address all issues it found, then I cleared context, and repeated. Rinse and repeat. It took 4 iterations before it approached nitpicking. The fact that each agent found new, legitimate problems that the last one had missed was concerning to me. Why can’t it find all of them at once?
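For anyone wanting to reproduce that kind of experiment, the loop can be sketched roughly as below. This is only a toy: `review` and `fix` are hypothetical stand-ins for whatever agent calls you actually use (not a real API), and the stub reviewer is deliberately rigged to surface at most two issues per pass, which is an assumption made purely to mimic a reviewer that never finds everything at once.

```python
from typing import Callable, List, Set


def review_until_quiet(
    review: Callable[[Set[str]], List[str]],
    fix: Callable[[Set[str], List[str]], Set[str]],
    open_issues: Set[str],
    max_rounds: int = 10,
) -> int:
    """Run fresh review passes until one reports nothing; return passes used."""
    for round_no in range(1, max_rounds + 1):
        found = review(open_issues)  # each pass starts with a cleared context
        if not found:
            return round_no  # the reviewer has nothing left to report
        open_issues = fix(open_issues, found)
    return max_rounds  # gave up: the reviewer was still finding issues


# Hypothetical stubs standing in for the two agents.
def stub_review(open_issues: Set[str], per_pass: int = 2) -> List[str]:
    # Assumption: a single pass only ever surfaces a couple of issues.
    return sorted(open_issues)[:per_pass]


def stub_fix(open_issues: Set[str], found: List[str]) -> Set[str]:
    # Assumption: every reported issue gets fixed and none are introduced.
    return open_issues - set(found)


rounds = review_until_quiet(stub_review, stub_fix, {"a", "b", "c", "d", "e", "f"})
print(rounds)
```

With six seeded issues and two surfaced per pass, the loop needs three fixing passes plus one quiet pass. A real run has no ground-truth issue set, so the stopping rule ends up being a human deciding that the findings have turned into nitpicks.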

You're expecting it to be a person. It's not.

It is more like a wiggly search engine. You give it a (wiggly) query and a (wiggly) corpus, and it returns a (wiggly) output.

If you are looking for a wiggly sort of thing 'MAKE Y WITH NO BUGS' or 'THE BUGS IN Y', it can be kinda useful. But thinking of it as a person because it vaguely communicates like a person will get you into problems because it's not.

You can try to paper over it with some agent harness or whatever, but you are really making a slightly more complex wiggly query that handles some of the deficiency space of the more basic wiggly query: "MAKE Y WITH NO ISSUES -> FIND ISSUES -> FIX ISSUE Z IN Y -> ...".

OK well what is an issue? _You_ are a person (presumably) and can judge whether something is a bug or a nitpick or _something you care about_ or not. Ultimately, this is the grounding that the LLM lacks and you do not. You have an idea about what you care about. What you care about has to be part of the wiggly query, or the wiggly search engine will not return the wiggly output you are looking for.

You cannot phrase a wiggly query referencing unavailable information (well, you can, but it's pointless). The following query is not possible to phrase in a way an LLM can satisfy (and this is the exact answer to your question):

- "Make what I want."

What you want is too complicated, and too hard, and too unknown. Getting what you are looking for reduces to: query for an approximation of what I want, repeating until I decide it no longer surfaces what I want. This depends on an accurate conception of what you want, so only you can do it.

If you remove yourself from the critical path, the output will not be what you want. Expressing what you want precisely enough to ground a wiggly search would just be something like code, and obviates the need for wiggly searching in the first place.


People pushing dozens of PRs per day need to learn to prioritize tasks, and balance a bit more towards quality over quantity.

This is the way. There's nothing inherently wrong with using AI as long as it's used responsibly.

I highly doubt there are any managers or executives who care how AI is precisely used as long as there are positive results. I would argue that this is indeed an engineering problem, not an upper management one.

What's missing is a realistic discussion about this problem online. We instead see insanely reckless people bragging about how fast they drove their pile of shit startup directly into the ground, or people in denial loudly banging drums to resist all forms of AI.


And maybe spend some time doing reviews for other developers. And if they aren't qualified to do that, then maybe spend that time becoming qualified rather than pumping out more slop.

Sounds like people need to speak up to management

Management doesn’t care. This sort of thing is becoming more common at my workplace too. More outages, more embarrassing bugs, even bugs that leak customer data. The solution is always more AI, and if you’re still shipping bugs and causing outages, it’s because you didn’t use the AI correctly. Leadership makes all the right noises about quality and ownership, but when it comes down to it, the incentive structures clearly prioritize shipping things faster, all else be damned.

Sounds like a fast track to sinking their company into the ground

Management wants to get rid of people; they want to have their "wish-machine" that does what they say without any need to deal with nerds or ethical issues.

Management likes how fast features are getting deployed so they essentially told us to just deal with it.

I mean speak up to management in a way where they know it's stupid and they're stupid for pushing it

Maybe it’s time to have multiple agents and models review the PRs and also provide context for easier human review. That and lots more focus on robust testing.

There’s no way velocity will decrease now that upper management is obsessed with AI.


I really think that software in general is getting buggier, with ChatGPT/Claude being some of the buggiest software I use. I constantly run into quality issues there and I've reported at least a dozen bugs to ChatGPT this year. One kicker I found recently was that Codex PR Reviews, once turned on for a repo, cannot be turned off - I got escalated to engineering who confirmed that they forgot to add a feature to disable code reviews.

Honestly by the time it gets to review it should be rock solid, so the only thing the reviewer has to think about is the big picture and never “does this actually do what it’s supposed to without abusing any of the interfaces”. Vibe coding makes solid validation, testing, and documentation trivial. The onus of proving your code is good needs to shift downward, not upward. And straight up vibing it is absolutely a terrible idea for anything other than a demo or a simple tool.

Don’t blame the public. If only BOTH sides didn’t give us clowns to vote for. I hope the Dems prop up some solid candidates next time because the past 8 years have been a total joke.

A large root of that problem is that Americans have been successfully sold on the false idea that getting to choose every few years between one of two candidates chosen by wealthy donors is democracy (or even government as a public trust).

And please don't say 'third parties'. The two major parties enjoy overwhelming structural advantages. Third parties are crippled even before they get started, and sabotaged if they show any signs of life. For example, in 2024, the No Labels movement, whose sole intent was to provide a reasonable alternative to the major party nominees for President, was targeted and in the end never even got a nominee.

https://www.reuters.com/world/us/biden-allies-plot-thwart-th...


Public should be blamed.

A rock that sits on the ground and does nothing would have been a better President than Trump, who campaigned on actively harming our economy with tariffs, then did exactly what he said he'd do, and look where we are now.


I used to blame voters too, then I spent a decade organizing and now I rarely blame voters. I always blame the politicians and political parties instead.

When the Democratic Party was the party of workers (the New Deal coalition), they had supermajorities in both houses of Congress for multiple lifespans (1930s to 1990s for the House of Reps alone), but for the last 50 years the party has slowly become more corporate, and corporate politicians campaign a certain way: a way that ignores the material needs of people. By ignoring these material needs, they're left to campaign on culture issues. Culture issues are very finicky. For example, it's not hard to find people that like solar panels, believe in carbon taxes, want a Green New Deal, but also believe in abortion and are evangelical. Since both parties no longer cater to workers, they're left to chase after cultural issues.

Not going to write a whole blog post out but hopefully you know where I'm going with this: the only way to truly win back sustainable power is to provide for the real systematic needs of every American. Needs like medicare for all, universal childcare, universal college/vocational training, public housing, and a public jobs program. All these issues poll at well over 60%, across party lines too; but they all require more taxes on corporations + elites.

Once you build a party for workers, you can do actual sustainable systematic change; but you lose this contract once you betray the workers.

Last time we did this we put a man on the moon. Imagine what we could truly do with the public backing you and today's advances.

If you can't convince people to vote for you, you have to change your platform. People don't really care about neoliberalism so repackaging it as abundance just means every election is a coin flip; it also doesn't help that the democratic party leadership is just as unpopular as Trump because party members, myself included, see how weak and useless they are but somehow always have enough muster to provide corporate welfare or engage in imperialism.


> Needs like providing medicare for all, universal childcare, universal college/vocational training, public housing, and a public jobs program

Do you remember how divisive the ACA was? Or how the Republicans have been threatening to gut it for over a decade now? They literally do not support ANY of the things that you mentioned.

You can both-sides this all day and night, but it does not reflect reality, where large swaths of unpopulated land get disproportionate representation in Congress.


There’s no solid evidence of that. In some aspects they can be a net positive sure but it’s modest.

Tons of evidence for it:

> The average H-1B household contributes $30,050 net annually — 2.6 times the $11,530 contribution of a typical U.S. household. At the state and local level, governments see a net average fiscal gain of $5,040 per H-1B household, with H-1B workers generating positive fiscal balances in 49 states. The fiscal benefits of the H-1B program are not exclusive to high-income states. The low-income state of Mississippi, for example, nets $4,600 per H-1B household — a figure that is higher than those of 21 other states.

https://eig.org/fiscal-impacts-h1bs/
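As a quick sanity check on the quoted multiple, the two dollar figures from the EIG excerpt can be divided directly. The variable names are just labels for illustration; the numbers are the ones quoted above.

```python
# Net annual fiscal contribution per household, as quoted from the EIG report.
h1b_household = 30_050
typical_household = 11_530

# Ratio of the H-1B household figure to the typical U.S. household figure.
ratio = h1b_household / typical_household
print(f"{ratio:.2f}x")
```

The ratio rounds to the 2.6x stated in the quote, so the headline multiple is at least internally consistent with the two dollar figures.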

> Despite its relatively constrained scale, the H-1B program has delivered economic returns far exceeding its original scope. Even under severe capacity limitations, estimates suggest the program generates $7.5–$31.8 billion in annual net benefits. Native workers experience wage gains rather than losses, while companies winning H-1B lotteries achieve higher job growth, productivity, and profit margins compared to similar firms denied visas.

https://www.csis.org/analysis/practical-h-1b-reforms-serve-u...

The fact is H1B workers are often more educated and better skilled than US citizens. In tech we care less about things like masters degrees and phds but the fact is H1Bs are more likely to have those and more likely to be appropriately skilled for jobs than US citizens. In general they are also richer than the avg US citizen in their society (that's how they can afford to move here for an advanced degree despite the currency exchange working against them)


We have the most important tech market in the world, extremely overrepresented by both usage and revenue, just because of our ability to attract top talent from everywhere else, and that seems modest to you?

Microsoft shops. Lots of C# devs gravitate to it naturally. I’m glad I abandoned the MS stack over a decade ago.


.NET Core runs just as well on ECS though. And C# tooling is rock solid in VS Code on Mac. No need to touch Azure or Windows.


This. Please please make this loop!


UK doing UK things per usual.


unfortunately, we have the problem in a few places in the US as well.

florida and texas in particular. [0]

last year florida had at least 2,300 instances of book bannings and texas had at least 1,700.

it's wild to watch this all happen so quickly.

[0] https://pen.org/book-bans/book-ban-resources/ (if you scroll down to the map it shows how many instances of book bannings by state)


They are more or less importing this from the US, where this has been going on for a time already, especially at school libraries.


I think part of this can be attributed to prolonged gut inflammation caused by toxins and parasites. Something like 60% of the population has some form of parasite and has no idea, which causes a lot of inflammation and problems. Problems that don’t necessarily point to the gut being the culprit on the surface. So it’s misdiagnosed a lot.

I recommend everyone do a gut cleanse once a year.


The CDC estimates that about 60 million people are affected by parasites in the USA, which is about 17 or 18% of the population.
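The percentage follows from simple division. The 60 million figure is the one cited above; the U.S. population of roughly 335 million is an assumption added here for the arithmetic and does not come from the thread.

```python
# Figure cited in the comment above.
affected = 60_000_000

# Assumption: approximate U.S. population, not stated in the thread.
us_population = 335_000_000

# Share of the population affected, as a percentage.
share_pct = 100 * affected / us_population
print(f"{share_pct:.1f}%")
```

Any population estimate between about 334 and 352 million keeps the result inside the 17 to 18% range.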

Gut cleanse, colon cleanse, detoxing. None of this is supported by science. Nor would any of these things cure, prevent or in any way help a parasitic infection.

Here are some common parasitic infections and how they're treated. None of these treatments involve a gut cleanse.

- https://en.wikipedia.org/wiki/Giardia#Infection
- https://en.wikipedia.org/wiki/Toxoplasma_gondii#Treatment
- https://en.wikipedia.org/wiki/Ascariasis#Treatment
- https://en.wikipedia.org/wiki/Hookworm_infection#Treatment
- https://en.wikipedia.org/wiki/Pinworm_infection#Treatment

Gut inflammation can be a problem, but I would not recommend treating it or even diagnosing it without evidence.


I was thinking of the 60% global statistic. https://www.sciencedirect.com/topics/immunology-and-microbio...

I had gut dysbiosis for the past couple of years. Went to an alternative/func doctor and she helped me do a program such as this, in a safe manner: https://www.gutprotocols.com/products/full-moon-kit-parasite...

While yes, this isn’t scientifically backed because there are just no clinical trials yet, that doesn’t mean it is bunk. I did a program myself and it fixed all of my problems. My stool inflammatory markers went down drastically, as did the myriad symptoms that caused me issues every day.

Perhaps I was wrong in strongly recommending people just go do this randomly without any doctor oversight. Whatever. I just wanted to offer my experience because it helped me and can help others. Take it or leave it.


"It is estimated that the global prevalence of some of these diseases already exceeds 60% among the more than three billion people living in parasite endemic areas." https://www.sciencedirect.com/topics/immunology-and-microbio...

This is an interesting statistic, but it is not a global statistic.


Gut cleanses are probably stupid but I wonder if people would benefit from taking antiparasitics prophylactically. It's not something I've ever done, but I eat sashimi pretty regularly and wonder if I should take something like praziquantel because I'm probably at risk for Japanese broad tapeworm, and the symptoms are mild enough I can't really tell without testing, but the price of actually testing is much higher than just taking a drug with a great safety profile.

For similar reasons, I also wonder about people who consume raw milk. These people are more likely to endorse ivermectin for e.g. covid, because it made them feel much better. Maybe it's possible these people aren't lying about that, but not because it cured their covid.


A gut cleanse is really just a herbal medicine protocol you do for a few weeks. Herbal medicine is not stupid, it has been used for thousands of years. Hell, even some pharmaceutical drugs use herbs.


Gut cleanses are just marketing. Occasionally eating healthy and then going back to regular unhealthy diet skews the middle point of gut health.


Well for me it killed the parasites I had plaguing me and cured a lot of sickness I was experiencing. To each his own.


If parasites were the concern then countries like Bangladesh would have far higher rates, given that people there tend to have orders of magnitude more parasites than anywhere in the developed world.

And I’m not sure what toxins is supposed to mean and how Americans are more exposed to toxins than developing world children scouring through our electronic garbage on a daily basis


Bangladesh actually had one of the highest parasite rates among children until the last decade.

Parasites are quite a global problem: https://www.sciencedirect.com/topics/immunology-and-microbio.... I don’t even know why we’re arguing this.

Now we don’t know what toxins are? Really?


What is a gut cleanse? That sounds destructive.

Doing an ambiguous preventive activity on 1 out of 365 days doesn't sound effective.


These sacrificial two-days-on-the-toilet offerings are like giving confessions to the priest to get back on the good side so you don't have to change your behavior.

Yes I can eat this 4200cal Costco pizza, I did my cleanse last month.


What do you mean when you say “do a gut cleanse”?


From the company that brought you the Lung Brush.


I mean a program (series of herbal supplements) that can cure gut inflammation and help with digestion, and in some circumstances get rid of parasites. Herbal medicine, things that other cultures have been doing for thousands of years. Do I have to prove that turmeric lowers inflammation in the body? What is going on in this thread lol.

“gut cleanse” obviously is a trigger phrase on HN it seems.


Don’t have to prove it to me… I was asking genuinely.


fenben, ivermect. herbs like blackseed oil, black walnut, even organic cloves


I sacrifice a houseplant to baphomet as an alternative


This should surprise no one. A CIA-backed VC was one of the first investors in Google. Big tech will always serve the powers that be. Employees who think their letters of appeal will do anything live in a fantasy land. That’s not how the real world works.


What is wrong with a company serving the country in which it operates?


Engineering Ethics is a standard required class in any engineering discipline and a whole field of discussion. The ethics of working on military stuff (or even just government stuff) is nowhere near as cut and dry as your question seems to imply.

For example:

- What if the country asked you to develop technology to track and hack journalists or political rivals the administration doesn't like?

- What if the country asked you to develop chemical weapons? Is it different if the weapons would be used on their own population or only on external "enemies"?

- What if the country asked you to personally assassinate a civilian of another country? What if they asked you to create a program that would do that? What if they asked you to simply create a list of targets, and you knew they'd be assassinated?

- What if the country asked you to build something in an unsafe way that you're pretty certain will cause harm to people?

- What if the country asked you to make a public statement lying about the purpose behind what you're building?


Surely that depends heavily on the country.


The country in question is the United States of America. You know, the one that Iranian Islamic Republic officials lead chants of "Death to America" about.

The US is not perfect, but this disparagement of the US for the benefit of the Islamic Republic is disgusting. As is the online bullying of people who stand up for the US.


Just because there are one or maybe several bad/worse countries in the world, that doesn't mean anything goes ethically. That's a dangerous line of reasoning.


Stores his production TF state on his local computer…

I don’t think AI is to blame here.


Yup. Amazon doubled their workforce through the pandemic. I think a lot of tech companies are still cutting fat from those days.

