Hacker News | AstroBen's comments

This was due to Claude Code, the agent harness. 4.6 was trained to use tools and operate in an agent environment. This is different from there being a huge bump in the underlying model's intelligence.

The takeaway here I think is that the "breakthrough" already happened and we can't extrapolate further out from it.


> Programming is only a part of the job devs are doing.

Programming is a huge part of the job. In a world where AI does the programming we're going to need 80% fewer software professionals.

It won't be a full replacement of the role, you're correct there - but it'll be a major downsizing because of productivity gains.


Everyone wouldn't starve in a few months. There is more than enough food and I have faith it'd be given out. The starvation we see today in a world where most genuinely have a chance to get out of it is nothing like a world in which people can't earn an income.

The government only has as much power as they are given and can defend, and the only way I could see that happening is via automated weapons controlled by a few- which at this point aren't enough to stop everyone. What army is going to purge their own people? Most humans aren't psychopaths.

I think it'd end in a painful transition period of "take care of the people in a just system or we'll destroy your infrastructure".


> The government only has as much power as they are given and can defend, and the only way I could see that happening is via automated weapons controlled by a few- which at this point aren't enough to stop everyone. What army is going to purge their own people? Most humans aren't psychopaths.

I think you're right for the immediate future.

I suspect while we're still employing large numbers of humans to fight wars and to maintain peace on the streets it would be difficult for a government to implement deeply harmful policies without risking a credible revolt.

However, we should remember the military is probably one of the first places human labour will be largely mechanised.

Similarly, maintaining order in the future will probably be less about recruiting human police officers and more about surveillance and data. Although I suppose the good news there is that the US is somewhat of an outlier in resisting this trend.

But regardless, the trend is ultimately the same... If we are assuming that AI and robotics will reach a point where most humans are unable to find productive work, therefore we will need UBI, then we should also assume that the need for humans in the military and police will be limited. Or to put it another way, either UBI isn't needed and this isn't a problem, or it is and this is a problem.

I also don't think democracy would collapse immediately either way, but I'd be pretty confident that in a world where fewer than 10% of people are in employment and 99%+ of the wealth is being created by the government or a handful of companies it would be extremely hard to avoid corruption over the span of decades. Arguably increasing wealth concentration in the US is already corrupting democratic processes today, and this can only worsen as AI continues to exacerbate the trend.


It seems inevitable that costs will come down over time. Expensive models today will be cheap models in a few years.

Of course it's what they're going for. If they could do it they'd replace all human labor - unfortunately it's looking like SWE might be the easiest of the bunch.

The weirdest thing to me is how many working SWEs are actively supporting them in the mission.


The day I start freaking out about my job is the day when my non-engineer friend turned vibe coder understands how or why the thing that AI wrote works. Or why something doesn't work exactly the way he envisioned and what it takes to get it there.

If it can replace SWEs, then there's no reason why it can't replace say, a lawyer, or any other job for that matter. If it can't, then SWE is fine. If it can - well, we're all fucked either way.


> If it can replace SWEs, then there's no reason why it can't replace say, a lawyer

SWE is unique in that for part of the job it's possible to set up automated verification for correct output - so you can train a model to be better at it. I don't think that exists in law or even most other work.
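To make "automated verification" concrete, here is a minimal sketch of the idea, not how any lab actually trains its models. The names `score_candidate` and `cases` are hypothetical: a candidate implementation is run against known input/output pairs and the pass rate is the kind of signal you could in principle optimize against.

```python
def score_candidate(candidate, cases):
    """Return the fraction of test cases a candidate function passes.

    `cases` is a list of (args, expected) pairs. Both names are
    illustrative assumptions, not part of any real training setup.
    """
    passed = 0
    for args, expected in cases:
        try:
            if candidate(*args) == expected:
                passed += 1
        except Exception:
            # A crash counts as a failed case, not a failed run.
            pass
    return passed / len(cases)


# Hypothetical spec: "add two numbers", expressed only as checks.
cases = [((1, 2), 3), ((0, 0), 0)]

correct = lambda a, b: a + b
wrong = lambda a, b: a * b  # passes (0, 0) by coincidence, fails (1, 2)

correct_score = score_candidate(correct, cases)
wrong_score = score_candidate(wrong, cases)
```

The score is only as good as the cases behind it, which is why this covers part of the job rather than all of it.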


What is the automated verification of correct output and who defines that?

But before verification, what IS correct output?

I understand the SWE process is unique in that there are some automations that verify some inputs and outputs, but this reasoning falls into the same fallacies that we've had since before the AI era. First one that comes to mind is that 100% code coverage in tests means that software is perfect.
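A small illustration of the coverage fallacy, as a sketch with made-up names (`average`, `test_average`, `crashes_on_empty_input` are all hypothetical): one test executes every line of the function, so line coverage is 100%, and the function still breaks on an input nobody tried.

```python
def average(values):
    """Return the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def test_average():
    # This single test runs every line of average(): 100% line coverage.
    assert average([2, 4, 6]) == 4


def crashes_on_empty_input():
    # The fully covered function still fails on an untested input.
    try:
        average([])
    except ZeroDivisionError:
        return True
    return False
```

Coverage says which lines ran, not which behaviours were checked.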


Right, and that's why it's only part of the job. The benchmarks they're currently doing consist of the AI being handed a detailed spec + tests to make pass, which isn't really what developing a feature looks like.

Going from fuzzy under-defined spec to something well defined isn't solved.

Going from well defined spec to verification criteria also isn't.

Once those are in place though, we get https://vinext.io - which from what I understand they largely vibe-coded by using NextJS's test suite.

> First one that comes to mind is that 100% code coverage in tests means that software is perfect

I agree.. but I'm also not sure if software needs to be perfect


Enthusiastically supporting them. It’s quite depressing to watch over the last few years. It’s not like they’re being coy about their aim…

Agree. Anthropic in particular have been quite clear about what they are trying to do. Every blog post about every new model almost dismisses every use case other than coding - every other use case seems almost a footnote in their communication.

Our DNA does contain our pre-training, though. It's not true that we're an entirely blank slate.

Pre-training is not a good term if you are trying to compare it to LLM pre-training. Closer would be the model's architecture and learning algorithms, which have been designed through decades of PhD research, and my point on that is that the differences are still much greater than the similarities.

The difference here is that everyone else in this product category is also sprinting full steam ahead trying to get as many users as they can.

If they DIDN'T heavily vibe-code it they might fall behind. Speed of implementation short term might beat out the long-term maintenance and iteration they'd get from quality code.

They're just taking on massive tech debt


> If they DIDN'T heavily vibe-code it they might fall behind

For you and me, sure - sprint as fast as we can using whatever means we can find. But when you have infinite money, hiring a solid team of traditional/acoustic/human devs is a negligible cost in money and time.

Especially if you give those devs enough agency that they can build on the product in interesting and novel ways that the AI isn’t going to suggest.

Everything is becoming slop now, and it almost always shows. I get why when you’re resource constrained. I don’t get why when you’re not.


> Everything is becoming slop now, and it almost always shows. I get why when you’re resource constrained. I don’t get why when you’re not

Every dollar spent is a dollar that shareholders can't have and executives can't hope for in their bonuses


> it doesn't really matter in the end

if you have one of the top models in a disruptive new product category where everyone else is sprinting also, sure..


Code quality never really mattered to users of the software. You can have the most <whatever metric you care about> code and still have zero users or have high user frustration from users that you do have.

Code quality only matters in maintainability to developers. IMO it's a very subjective metric


It's not subjective at all. It's not art.

Code quality = fewer bugs long term.

Code quality = faster iteration and easier maintenance.

If things are bad enough it becomes borderline impossible to add features.

Users absolutely care about these things.


Okay, but I meant how you measure is subjective.

How do you measure code quality?

> Users absolutely care about these things.

No, users care about you adding new features, not about your ability to add new features or how much it cost you to add them.


99.999999% of products can't get away with what Anthropic is able to - this is a one in a billion disruptive product with minimal competition, and its success so far is mostly due to Claude the model, not the agent harness

Strange, even
