More

rurp · 2026-04-10T23:48:22 1775864902

Reach out to investors that you or your friends/family have an existing relationship with. If that's not an option your odds of being funded are slim. Consider going to Stanford or networking with them another way; the latter being much easier if you're already wealthy.

This might sound like a joke or overly cynical but I'm being totally serious. Merit and product quality are only very loosely correlated with funding success at this stage.

The vast majority of projects don't get VC funded and of the small number that do most have some sort of relationship or other in into the funding world.

This doesn't mean you can't get funded, but it's a huge longshot. If you already have an income and some savings consider bootstrapping the project yourself, at least until you have some traction in the market.

conartist6 · 2026-04-11T14:24:18 1775917458

I appreciate the advice, but I'm a couple steps ahead of you. I spent 10 years mastering my trade (UI) working in San Francisco and Palo Alto and Menlo Park, and that's when I saved enough money to allow myself to work on this for the last 5 years.

And fortunately I don't need anyone's permission. It's just too late to stop the wave of change that's coming now. After 5 decades, the punchcard is finally going to be retired as the primitive at the heart of all programming.

rurp · 2026-04-09T21:00:21 1775768421

LLMs are and will be used as malware, propaganda, and slop generation agents more than they will be used as debugging agents. The amount of energy that we'll need to consume going forward just to defend against malicious users and to filter down the flood of slop is absolutely eye watering and will continue to grow as far as we can tell.

rurp · 2026-04-09T20:09:11 1775765351

> almost the entirety of their valuation is predicated on near-instantaneous robotaxi rollout and a near monopoly on humanoids and neither of those are going to happen.

I also expect their robotaxis and robots to be a joke business-wise for the foreseeable future, but I totally disagree that they have much to do with the stock price. The real driver is Elon's immense cult following. The Tesla stock price has always been absurd relative the actual business, but cult followers don't care about financial statements, or engineering work, or business logic. They will ignore all of that to support their leader.

Elon has been spewing blatant lies for well over a decade at this point and always claims to be chasing some shiny object a year out that will transform the business and grow its revenue to match the valuation. It never happens, but people don't care.

He has lost some supporters from his peak due to his political and social media insanity, but most of the diehards have kept by him. The odds of him diving back into MAGA politics in a visible way is a larger stock price risk than continuously declining sales, but even then he would have to go to extreme lengths to shake off most of the people who have stuck behind him for this long.

rurp · 2026-04-09T16:13:45 1775751225

How does this address the most common case where many people were harmed a modest amount? Causing $100 of harm to a million people is a huge amount of damage that should be punished, but nobody is going to launch a full independent lawsuit for $100.

rurp · 2026-04-09T16:10:24 1775751024

Yeah, glad to see Zuck is sticking with those strong free speech principles he couldn't wait to get back to last year.

alex1138 · 2026-04-09T16:46:39 1775753199

Free speech which apparently includes https://news.ycombinator.com/item?id=42651178

I actually am more at odds with HN than many people might be because I think the lies surrounding covid and the censorship were absolutely wrong and platforms could genuinely after things like that lay claim to being unfairly directed, but you can tell Zuck doesn't actually care because he immediately started doing that

rurp · 2026-04-08T19:51:43 1775677903

What's the incentive for a ship captain to risk this? Even if they're more confident than almost everyone else that it's a bluff and think there's a 95% chance Iran does nothing, a 5% chance of you and your crew being incinerated is a crazy risk to take.

Would you go to your normal job tomorrow if someone who has a history of carrying out threats has threatened to kill you for it?

kilgoresalmon · 2026-04-08T20:08:20 1775678900

You can't imagine someone who would go to that job simply because the owner hired a bouncer and they have a different faith in authority or really mean looking bouncers than you?

I can spend 10 minutes looking at demographics and tell you the world is not explainable if the measuring stick is my own risk tolerances.

rurp · 2026-04-07T15:54:26 1775577266

HN is a big community that has always had a mix of people who value newness as a feature vs those who prioritize simplicity and reliability. Unless you're recognizing the exact same names taking these contradictory opinions it's probably different groups of people for the most part.

It seems like every LLM thread for the past couple years is full of posts saying that the latest hot AI tool/approach has made them unbelievably more productive, followed by others saying they found that same thing underwhelming.

echelon · 2026-04-07T16:05:42 1775577942

> I get it. LLMs are cool technology.

I don't think many of you have legitimately tried Claude Code, or maybe you're holding it wrong.

I'm getting 10x the work done. I'm operating at all layers of the stack with a speed and rapidity I've never had before.

And before anyone accuses me of being some "vibe coder", I've built five nines active-active money rails that move billions of dollars a day at 50kqps+, amongst lots of other hard hitting platform engineering work. Serious senior engineering for over a decade.

This isn't just a "cool technology". We've exited the punch card phase. And that is hard or impossible to come back from.

If you're not seeing these same successes, I legitimately think you're using it wrong.

I honestly don't like subscription services, hyperscaler concentration of power, or the fact I can't run Opus locally. But it doesn't matter - the tool exists in the shape it does, and I have to consume it in the way that it's presented. I hope for a different offering that is more democratic and open, but right now the market hasn't provided that.

It's as if you got access to fiber or broadband and were asked to go back to ISDN/dial up.

nerptastic · 2026-04-07T16:25:48 1775579148

Man I really thought this was satire. It’s phenomenal that you can gain 10x benefits at all layers of the stack, you must have a very small development team or work alone.

I just don’t see how I could export 10x the work and have it properly validated by peers at this point in time. I may be able to generate code 10-20x faster, but there are nuances that only a human can reason about in my particular sector.

suzzer99 · 2026-04-07T17:21:19 1775582479

Senior engineer with 25 years of experience here. I wish I spent enough time actually coding that 10x-ing my coding productivity would matter much to my job. Most of my day is spent wrangling requirements, looking after junior devs, stamping out confusion brush fires before they get out of control, and generally just trying to steer the app away from a trainwreck down the line.

When I do code, it's almost always something novel that I don't know how I'm going to implement until I code a few pieces and see how they fit together. If it's a fairly routine feature based on an existing pattern, I assign it to one of the other devs.

shuntress · 2026-04-07T21:58:04 1775599084

This is basically the thing I keep coming back to with the agentic tools. It is the wrangling requirements, stamping out confusion, and steering away from a trainwreck down the line that are the actual challenging parts of the job and we can't automate those yet. Once you do actually know the code change you want to make though it is pretty nice to change it 10x faster than before.

hsuduebc2 · 2026-04-07T16:34:14 1775579654

I noticed that too. At start. It vaguely reminded me of the famous Navy SEAL copypasta.

seanw444 · 2026-04-07T17:36:33 1775583393

What the fuck did you just fucking say about me, you little bitch?

hsuduebc2 · 2026-04-07T22:36:40 1775601400

I see some confusion from your downvotes. Sean cited the start of "navy seal copypasta"

The Navy Seal copypasta is a internet meme consisting of an absurdly over the top tough guy threat, posted to mock someone who is acting somehow aggressive, insecure, or self important online, usually after a minor argument.

Here it's full text:

>What the fuck did you just fucking say about me, you little bitch? I'll have you know I graduated top of my class in the Navy Seals, and I've been involved in numerous secret raids on Al-Quaeda, and I have over 300 confirmed kills. I am trained in gorilla warfare and I'm the top sniper in the entire US armed forces. You are nothing to me but just another target. I will wipe you the fuck out with precision the likes of which has never been seen before on this Earth, mark my fucking words. You think you can get away with saying that shit to me over the Internet? Think again, fucker. As we speak I am contacting my secret network of spies across the USA and your IP is being traced right now so you better prepare for the storm, maggot. The storm that wipes out the pathetic little thing you call your life. You're fucking dead, kid. I can be anywhere, anytime, and I can kill you in over seven hundred ways, and that's just with my bare hands. Not only am I extensively trained in unarmed combat, but I have access to the entire arsenal of the United States Marine Corps and I will use it to its full extent to wipe your miserable ass off the face of the continent, you little shit. If only you could have known what unholy retribution your little "clever" comment was about to bring down upon you, maybe you would have held your fucking tongue. But you couldn't, you didn't, and now you're paying the price, you goddamn idiot. I will shit fury all over you and you will drown in it. You're fucking dead, kiddo.

Aurornis · 2026-04-07T17:35:59 1775583359

> I just don’t see how I could export 10x the work and have it properly validated by peers at this point in time.

In my experience, the people who 10X their output with Claude Code fit one of two categories:

1. They're not really taking the time to understand the code they're submitting. They might do a skim over the output and see that it looks reasonable and passes tests, but they aren't taking time to understand the code as if they were pair programming. Only when it breaks and the LLM can't patch it up quickly do they go in and fully understand the code.

2. They moved very slowly before Claude Code. I've had some coworkers who would take 2-3 days to get a simple PR out because, to be frank, their work days weren't full of a lot of work. Every time they'd run into a question they'd stop and then bumble around for a few hours until they could talk to the ticket creator about it. They'd get tired of working on a task by 2PM and then save the rest of the work for tomorrow. They'd get an idea and decide to rewrite the PR the next day, and on and on with distractions. When they start using Claude Code the LLM doesn't have the same holdups, so now every time where they were getting stuck or tired before is replaced by an LLM powering through to some solution. Their cognitive load is reduced so they're no longer freezing up during the day. They aren't really becoming 10X engineers like they think, but really just catching up to normal pace

rockostrich · 2026-04-07T17:53:26 1775584406

I don't know if we're all 10x'ing but our entire org is shipping PRs using an in-house framework akin to Stripe's Minions [1] and many of those PRs are generated from Slack. We definitely have work to do on the latter part of the SDLC to have more confidence in these changes but we can still rely on the existing observability layer to make sure things are working as expected.

Another commenter mentioned that Docker, git, etc. were all tools that greatly enhanced productivity and coding agents are just another tool that does that. I would agree, but argue that it's more impactful than all of those tools combined.

[1] https://stripe.dev/blog/minions-stripes-one-shot-end-to-end-...

shuntress · 2026-04-07T22:01:28 1775599288

Regarding point #2, while it is of course entirely possible that they are slackers it is more likely that they lack the knowledge you are leveraging in order to declare that the PRs are "simple"

deely3 · 2026-04-07T20:48:32 1775594912

It's simpler actually: author trying to make a business developing AI product.

AlexeyBelov · 2026-04-09T06:23:58 1775715838

Yes, and he's shilling in almost every thread. This is tiring.

nothinkjustai · 2026-04-07T17:29:46 1775582986

It is satire! They have been doing this bit for a while and people keep falling for it lol

Aurornis · 2026-04-07T17:22:54 1775582574

I use Claude Code a lot, but I don't understand these "I'm doing 10X the work" comments.

I spend a lot of time reviewing any code that comes out of Claude Code. Even using Opus 4.6 with max effort there is almost always something that needs to be changed, often dramatically.

I can see how people go down the path of thinking "Wow, this code compiles and passes my tests! Ship it!" and start handing trust over to Opus, but I've already seen what this turns into 6 months down the road: Projects get mired down in so much complexity and LLM spaghetti that the codebase becomes fragile. Everyone is sidetracked restructuring messy code from the past, then fighting bugs that appear in the change.

I can believe some of the more recent studies showing LLMs can accelerate work by circa 20% (1.2X) because that's on the same order of magnitude that I and others are seeing with careful use.

When someone comes out and claims 10X more output, I simply cannot believe they're doing careful engineering work instead of just shipping the output after a cursory glance.

tracker1 · 2026-04-07T18:33:14 1775586794

I find that it's relative to the amount of planning time you spend... I feel like I've gotten around 5x the output while using Claude Code w/ Opus over what I will get done myself... That said, I'm probably spending about 3x as much time planning as I would when just straight coding for/by myself. And that's generally the difference.

I can use the agent to scaffold a lot of test/demo frameworks around the pieces I'm working on pretty cleanly and have the agent fill in. I still spend a lot of time validating the tests and the code being completed though.

The errors I tend to get from the agent are roughly similar to what I might see from a developer/team that works remotely... you still need to verify. The difference is the turn around seems to be minutes over days. You're also able to observe over simply review... When I see a bad path, I can usually abort/cancel, revert back to the last commit and try again with more planning.

Pay08 · 2026-04-07T21:53:22 1775598802

That's part of why I don't get AI for directly writing code at all. If I am going to be reviewing anything that comes out of it (and I will) then I might as well just write it myself. It's easier and faster, although it does also make it easier to fall victim to blind spots.

eijkene · 2026-04-07T23:24:42 1775604282

I think there’s a subset of engineers who were never all that good (we have no idea how many there are) who benefit most from llm’s.

We should also keep in mind there’s always been an insane shortage of high quality devs. So I’m not surprised with what we seeing.

But this notion that an elite dev is seeing 10x productivity gain is absolute nonsense. LLM’s hold experts back in most contexts.

dandellion · 2026-04-07T16:29:48 1775579388

You must be using it wrong, because I'm getting 100x the work done and currently at 1.5 million MRR with this SAAS I vibe coded over the weekend.

After I solved entrepreneurship I decided to retire and I now spend my days reading HN, posting on topics about AI.

darth_aardvark · 2026-04-07T16:40:00 1775580000

You're still manually posting? All of my HN posting, trolling, shitposting and spamming is taken care of by a fleet of bots I vibecoded in the last 5 minutes.

slowmovintarget · 2026-04-07T17:28:16 1775582896

You gest, but I know people who've done this.

"I gotta be present." Me: Reenacting the Malcolm Reynolds too many responses meme.

xantronix · 2026-04-07T17:01:10 1775581270

Mind if I use this as a copypasta for the future? This checks off every point people bring on LinkedIn and elsewhere.

In all seriousness though, writing code, or even sitting down and properly architecting things, have never been bottlenecks for me. It has either been artificial deadlines preventing me from writing proper unit tests, or the requirement for code review from people on my team who don't even work on the same codebase as I do on a daily basis. I have often stated and stand by the assertion that I develop at the speed of my own understanding, and I think that is a good virtue to carry forth that I think will stand the test of time and bring about the best organisational outcomes. It's just a matter of finding the right place that values this approach.

Edit for context: My team is an ops team that needed a couple developers; I was picked to implement some internal tooling. The deadlines I was given for the initial development are tied directly to my performance evaluation. My boss has only ever been a manager for almost two years. He has only ever had development headcount for less than a year. He has never been on a development team himself. The man does not take breaks and micromanages at every opportunity he gets. He is paranoid for his job, thinking he is going to be imminently replaced by our (cheaper) EU counterparts. His management style and verbal admonitions reflect this; he frequently projects these insecurities onto others, using unnecessarily accusatory speech. I am not the only developer on my team who has had such interactions with him. I have screenshots of conversations with him that I felt necessary to present to a therapist. This degree of time pressure is entirely unprecedented in my 20 year career. Yes, this is a dysfunctional environment.

mikebenfield · 2026-04-07T17:18:25 1775582305

> artificial deadlines preventing me from writing proper unit tests, or the requirement for code review from people on my team who don't even work on the same codebase as I do on a daily basis

I have never experienced this, and it sounds remarkably dysfunctional to me.

xantronix · 2026-04-07T17:55:39 1775584539

Believe me, it is very dysfunctional. As I've mentioned to your first replyer, my boss has only had developers for less than a year. This is an operations team I was assigned to in order to provide them some much needed tooling. The pressure my boss has perceived from above has led to my own significant burnout. The guy does not take days off and has always been logged into Slack on the odd hours I would need to pull up some HR form or another. I am currently off work for several months dealing with the fallout from all that.

I've tried everything I can to cope and am not sure I will be willing to return to that team once I am past my medical leave.

lijok · 2026-04-07T17:40:17 1775583617

[flagged]

xantronix · 2026-04-07T17:50:26 1775584226

Beg pardon? I've been doing this for 20 years. My boss has been a boss for two years and has only had developer headcount for less than a year. This degree of pressure is unprecedented in my career.

lijok · 2026-04-07T19:21:24 1775589684

[flagged]

xantronix · 2026-04-07T19:37:28 1775590648

Please explain, I'd like to have a productive dialogue about this. I assume you are referring to my boss?

dwaltrip · 2026-04-07T17:04:56 1775581496

It’d be cool to see your process in depth. You should record some of your sessions :)

I mostly believe you. I have seen hints of what you are talking about.

But often times I feel like I’m on the right track but I’m actually just spinning when wheels and the AI is just happily going along with it.

Or I’m getting too deep on something and I’m caught up in the loop, becoming ungrounded from the reality of the code and the specific problem.

If I notice that and am not too tired, I can reel it back in and re-ground things. Take a step back and make sure we are on reasonable path.

But I’m realizing it can be surprisingly difficult to catch that loop early sometimes. At least for me.

I’ve also done some pretty awesome shit with it that either would have never happened or taken far longer without AI — easily 5x-10x in many cases. It’s all quite fascinating.

Much to learn. This idea is forming for me that developing good “AI discipline” is incredibly important.

P.s. sometimes I also get this weird feeling of “AI exhaustion”. Where the thought of sending another prompt feels quite painful. The last week I’ve felt that a lot.

P.p.s. And then of course this doesn’t even touch on maintaining code quality over time. The “after” part when the LLM implements something. There are lots of good patterns and approaches for handling this, but it’s a distinct phase of the process with lots of complexities and nuances. And it’s oh-so-temping to skip or postpone. More so if the AI output is larger — exactly when you need it most.

ericmcer · 2026-04-07T16:22:41 1775578961

I mean at this point can we just conclude that there are a group of engineers who claim to have incredible success with it and a group that claim it is unreliable and cannot be trusted to do complex tasks.

I struggle to believe that a ton of seemingly intelligent software engineers are too dumb to figure out how to use Claude code to get reliable results, it seems much more likely to me that it can do well at isolated tasks or new projects but fails when pointed at large complex code bases because it just... is a token predictor lol.

But yeah spinning up a green fields project in an extensively solved area (ledgers) is going to be something an AI shines at.

It isn't like we don't use this stuff also, I ask Cursor to do things 20x a day and it does something I don't like 50% of the time. Even things like pasting an error message it struggles with. How do I reconcile my actual daily experience with hype messages I see online?

rurp · 2026-04-07T16:50:00 1775580600

Right, I keep seeing people talking past each other in this same way. I don't doubt folks when they say they coded up some greenfield project 10x faster with Claude, it's clearly great at many of those tasks! But then so many of them claim that their experience should translate to every developer in every scenario, to the point of saying they must be using it wrong if they aren't having the same experience.

Many software devs work in teams on large projects where LLMs have a more nuanced value. I myself mostly work on a large project inside a large organization. Spitting out lines of code is practically never a bottleneck for me. Running a suite of agents to generate out a ton of code for my coworkers to review doesn't really solve a problem that I have. I still use Claude in other ways and find it useful, but I'm certainly not 10x more productive with it.

dns_snek · 2026-04-07T17:26:56 1775582816

> But yeah spinning up a green fields project in an extensively solved area (ledgers) is going to be something an AI shines at.

I couldn't disagree with this more. It's impressive at building demos, but asking it to build the foundation for a long-term project has been disastrous in my experience.

When you have an established project and you're asking it to color between the lines it can do that well (most of the time), but when you give it a blank canvas and a lot of autonomy it will likely end up generating crap code at a staggering pace. It becomes a constant fight against entropy where every mess you don't clean up immediately gets picked up as "the way things should be done" the next time.

Before someone asks, this is my experience with both Claude Code (Sonnet/Opus 4.6) and Codex (GPT 5.4).

hombre_fatal · 2026-04-07T16:26:30 1775579190

I suspect many people here have tried it, but they expected it to one-shot any prompt, and when it didn't, it confirmed what they wanted to be true and they responded with "hah, see?" and then washed their hands of it.

So it's not that they're too stupid. There are various motivations for this: clinging on to familiarity, resistance to what feels like yet another tool, anti-AI koolaid, earnestly underwhelmed but don't understand how much better it can be, reacting to what they perceive to be incessant cheerleading, etc.

It's kind of like anti-Javascript posts on HN 10+ years ago. These people weren't too stupid to understand how you could steelman Node.js, they just weren't curious enough to ask, and maybe it turned out they hadn't even used Javascript since "DHTML" was a term except to do $(".box").toggle().

I wish there were more curiosity on HN.

ericmcer · 2026-04-07T16:44:21 1775580261

So what do I do differently then?

Hypothetically, you have a simple slice out of bounds error because a function is getting an empty string so it does something like: `""[5]`.

Opus will add a bunch of length & nil checks to "fix" this, but the actual issue is the string should never be empty. The nil checks are just papering over a deeper issue, like you probably need a schema level check for minimum string length.

At that point do you just tell it like "no delete all that, the string should never be empty" and let it figure that out, or do I basically need to pseudo code "add a check for empty strings to this file on line 145", or do I just YOLO and know the issue is gone now so it is no longer my problem?

My bigger point is how does an LLM know that this seemingly small problem is indicative of some larger failure, like lets say this string is a `user.username` which means users can set their name to empty which means an entire migration is probably necessary. All the AI is going to do is smoosh the error messages and kick the can.

echelon · 2026-04-07T17:53:41 1775584421

1. I'm working in Rust, so it's a very safe and low-defect language. I suspect that has a tremendous amount to do with my successes. "nulls" (Option<T>) and "errors" (Result<T,E>) must be handled, and the AST encodes a tremendous amount about the state, flow, and how to deal with things. I do not feel as comfortable with Claude Code's TypeScript and React outputs - they do work, but it can be much more imprecise. And I only trust it with greenfield Python, editing existing Python code has been sloppy. The Rust experience is downright magical.

2. I architecturally describe every change I want made. I don't leave it up to the LLM to guess. My prompts might be overkill, but they result in 70-80ish% correctness in one shot. (I haven't measured this, and I'm actually curious.) I'll paste in file paths, method names, struct definitions and ask Claude for concrete changes. I'll expand "plumb foo field through the query and API layers" into as much detail as necessary. My prompts can be several paragraphs in length.

3. I don't attempt an entire change set or PR with a single prompt. I work iteratively as I would naturally work, just at a higher level and with greater and broader scope. You get a sense of what granularity and scope Claude can be effective at after a while.

You can't one shot stuff. You have to work iteratively. A single PR might be multiple round trips of incremental change. It's like being a "film director" or "pair programmer" writing code. I have exacting specifications and directions.

The power is in how fast these changes can be made and how closely they map to your expectations. And also in how little it drains your energy and focus.

This also gives me a chance to code review at every change, which means by the time I review the final PR, I've read the change set multiple times.

Chu4eeno · 2026-04-07T18:44:15 1775587455

I hope you're not 100% serious.

Otherwise you should switch to haskal since it makes logic errors and bugs mathematically impossible.

UI_at_80x24 · 2026-04-07T17:10:25 1775581825

I have encountered the exact same kind of frustration, and no amount of prompting seems to prevent it from "randomly" happening.

`the error is on line #145 fix it with XYZ and add a check that no string should ever be blank`

It's the randomness that is frustrating, and that the fix would be quicker to manually input that drives me crazy. I fear that all the "rules" I add to claude.md is wasting my available tokens it won't have enough room to process my request.

unshavedyak · 2026-04-07T17:21:46 1775582506

Yup, this is why i firmly believe true productivity, as in, it aiding you to make you faster, is limited by the speed of review.

I think Claude makes me faster, but the struggle is always centered around retaining own context and reviewing code fully. Reviewing code fully to make sure it’s correct and the way I want it, retaining my own context to speed up reviews and not get lost.

I firmly believe people who are seeing massive gains are simply ignoring x% lines of code. There’s an argument to be made for that being acceptable, but it’s a risk analysis problem currently. Not one I subscribe to.

julian37 · 2026-04-07T17:21:29 1775582489

Use planning+execution rather than one-shotting, it'll let you push back on stuff like this. I recommend brainstorming everything with https://github.com/obra/superpowers, at least to start with.

Then work on making sure the LLM has all the info it needs. In this example it sounds like perhaps your hypothetical data model would need to be better typed and/or documented.

But yeah as of today it won't pick up on smells as you do, at least not without extra skills/prompting. You'll find that comforting or annoying depending on where you stand...

hombre_fatal · 2026-04-07T18:20:18 1775586018

Always start an implementation in Claude Code plan mode. It's much more comprehensive than going straight to impl. I never read their prompt for plan mode before, but it deep-dives the code, peripheral files, callsites, documentation, existing tests, etc.

You get a better solution but also a plan file that you can review. And, also important, have another agent review. I've found that Codex is really good at reviewing plans.

I have an AGENTS.md prompt that explains that plan file review involves ranking the top findings by severity, explaining the impact, and recommending a fix to each one. And finally recommend a simpler directional pivot if one exists for the plan.

So, start the plan in Claude Code, type "Review this plan: <path>" in Codex (or another Claude Code agent), and cycle the findings back into Claude Code to refine the plan. When the plan is updated, write "Plan updated" to the reviewer agent.

You should get much better results with this capable of much better arch-level changes rather than narrow topical solutions.

If that's still not working sufficiently for you, maybe you could use more support, like a type-system and more goals in AGENTS.md?

skydhash · 2026-04-07T19:08:22 1775588902

IMO, plan mode is pretty useless. For bug fixes and small improvements, I already know where to edit (and can do it quickly with vim-fu).

For new features, I spend a bit of time thinking, and I can usually break it down in smaller tasks that are easy to code and verify. No need to wrangle with Plan mode and a big markdown file.

I can usually get things one-shotted by that point if I bother with the agent.

Izkata · 2026-04-08T16:14:21 1775664861

My manager and I have been experimenting with it for some stuff, and our most recent attempt at using plan mode was a refactor to change a data structure and make some conversion code unnecessary, then delete it. The plan looked fine, but after it ran the data structure change was incomplete, most of the conversion code was still there, and it introduced several bugs by changing lines it shouldn't have touched at all. Also removed several "why" style comments and arbitrarily changed variable names to be less clear in code it otherwise didn't change.

This was the costliest one we had access to, chosen as an experiment - took $20 over almost a half hour to run.

hombre_fatal · 2026-04-08T20:14:34 1775679274

Did you do the plan review cycles like I suggested? It's a critical point.

Plan mode gives you a plan file, then you refine that, and impl derives from it.

Also, do you know it cost $20 because you're using the Claude API? I'd definitely use a subscription for interactive/development use.

Izkata · 2026-04-08T23:26:55 1775690815

We reviewed the plan manually, asked it a few questions to clarify parts, and manually tweaked other parts.

I didn't catch what it was, some web dashboard that showed the cost per prompt. We could see it going up as it ran. We were just using the plan our company provided.

dpkirchner · 2026-04-07T17:05:18 1775581518

Not the person you're replying to but yes, sometimes I do tell the agent to remove the cruft. Then I back up a few messages in the context and reword my request. Instead of just saying "fix this crash", or whatever, I say "this is crashing because the string is empty, however it shouldn't be empty, figure out why it's empty". And I might have it add some tests to ensure that whatever code is not returning/passing along empty strings.

rattlesnakedave · 2026-04-07T16:28:14 1775579294

“I struggle to believe that a ton of seemingly intelligent software engineers are too dumb to figure out how to use Claude code to get reliable results”

Seemingly is doing the heavy lifting here. If you read enough comment threads on HN, it will become obvious why they aren’t getting results.

alwillis · 2026-04-07T17:03:44 1775581424

> I struggle to believe that a ton of seemingly intelligent software engineers are too dumb to figure out how to use Claude code to get reliable results.

They're not dumb, but I'm not surprised they're struggling.

A developer's mindset has to change when adding AI into the mix, and many developers either can’t or won’t do that. Developers whose commits that look something like "Fixed some bugs" probably aren’t going to take the time to write a decent prompt either.

Whenever there's a technology shift, there are always people who can't or won't adapt. And let's be honest, there are folks whose agenda (consciously or not) is to keep the status quo and "prove" that AI is a bad thing.

No wonder we're seeing wildly different stories about the effectiveness of coding agents.

dandellion · 2026-04-07T16:38:27 1775579907

Here's my 100 file custom scaffolding AI prompt that I've been working on for the last four months, and can reliably one-shot most math olympic problems and even a rust to do list.

rattlesnakedave · 2026-04-07T19:01:18 1775588478

Case in point

echelon · 2026-04-08T01:01:44 1775610104

Let's come back to these comments in ten years, shall we? Should be pretty entertaining.

jerf · 2026-04-07T17:50:19 1775584219

I see two basic cases for the people who are claiming it is useless at this point.

One is that they tried AI-based coding a year or two ago, came to the IMHO completely correct at that time conclusion that it was nearly useless, and have not tried it since then to see that the situation has changed. To which the solution is, try it again. It changed a lot.

The other are those who have incorporated into their personal identity that they hate AI and will never use it. I have seen people do things like fire AI at a task they have good reasons to believe it will fail at, and when it does, project that out to all tasks without letting themselves consciously realize that picking a bad task on purpose skews the deck.

To those people my solution is to encourage them to hold on to their skepticism. I try to hold on to it as well despite the incredible cognitive temptation not to. It is very useful. But at the same time... yeah, there was a step change in the past year or so. It has gotten a lot more useful...

... but a lot of that utility is in ways that don't obviate skilled senior coding skills. It likes to write scripting code without strong types. Since the last time I wrote that, I have in fact used it in a situation where there were enough strong types that it spontaneously originated some, but it still tends to write scripting code out of that context no matter what language it is working in. It is good at very straight-line solutions to code but I rarely see it suggest using databases, or event sourcing, or a message bus, or any of a lot of other things... it has a lot of Not Invented Here syndrome where it instead bashes out some minimal solution that passes the unit tests with flying colors but can't be deployed at scale. No matter how much documentation a project has it often ends up duplicating code just because the context window is only so large and it doesn't necessarily know where the duplicated code might be. There's all sorts of ways it still needs help to produce good output.

I also wonder how many people are failing to prompt it enough. Some of my prompts are basically "take this and do that and write a function to log the error", but a lot of my prompts are a screen or two of relevant context of the project, what it is we are trying to do, why the obvious solution doesn't work, here's some other code to look at, here's the relevant bugs and some Wiki documentation on the planning of the project, we should use {event sourcing/immutable trees/stored procedures/whatever}, interact with me for questions before starting anything. This is not a complete explanation of what they are doing anymore, but there's still a lot of ways in which what an LLM can really do is style transfer... it is just taking "take this and do that and write a function to log the error" and style-transforming that into source code. If you want it to do something interesting it really helps to give it enough information in the first place for the "style transfer" to get a hold of and do something with. Don't feel silly "explaining it to a computer", you're giving the function enough data to operate on.

sutib · 2026-04-07T20:43:36 1775594616

I can see huge utility with AI as a guide and helper.

But not being one leg in the code myself is not something I am comfortable with. It starts feeling like management and not development. I really feel the abdication very strongly and it makes me unable and unwilling to put a hard stamp on quality. I have seen too much hallucination or half missed requirements to put that much trust in AI.

It's the same with code reviews of hard tickets. You can scroll past and just approve, but do you really understand what your colleague has built? Are you really in the driver's seat? It feels to me like YOLOing with major consequences.

I dont but, at all that people doing 20x output have any idea what they are coding. They are just pressing the yolo button and no one, not the engineer, not the AI and not management is in the driver's seat. it is a very scary time.

xantronix · 2026-04-08T23:59:02 1775692742

I reread your comment and I think you might be sincere. To address this point:

> If you're not seeing these same successes, I legitimately think you're using it wrong.

I'm not sure how you could say that, considering I'm not using it at all. I don't want to, and I don't plan to. If that becomes an issue, I'm exiting this industry because I simply don't fucking care any longer. I am fine living the rest of my life and dying happy and sore being an automotive technician.

epistasis · 2026-04-07T16:17:01 1775578621

I'm still reviewing all the code that's created, and asking for modifications, and basically using LLMs as a 2000 wpm typist, and seeing similar productivity gains. Especially in new frameworks! Everything is test driven development, super clean and super fast.

The challenge now is how to plan architectures and codebases to get really big and really scale, without AI slop creating hidden tech debt.

Foundations of the code must be very solid, and the architecture from the start has to be right. But even redoing the architecture becomes so much faster now...

ipaddr · 2026-04-07T16:50:11 1775580611

I'm getting 1,000x improvement building notepad applications with 6 9s. No one is faster.

Need some help selling these notepad apps, do you have a prompt for that?

RaftPeople · 2026-04-07T19:06:13 1775588773

I don't want to sound like I'm trying to one-up you, but I've basically vibe coded the entire internet.

I'm surprised nobody thought of it before me but basically the LLM's are trained on the internet and I just had it spit back out everything.

It's running in parallel so I can validate it, which of course I'm using LLM's to do that.

Once it's ready I will put it on the market, but get this, my internet will be cheaper than the current internet. I'll probably just make it one cheaper, like if the current internet costs, for example, 7, I'll make my internet cost 6.

embedding-shape · 2026-04-07T16:13:33 1775578413

> and I have to consume it in the way that it's presented

I'm just curious, why do you "have to"? Don't get me wrong, I'm making the same choice myself too, realizing a bunch of global drawbacks because of my local/personal preference, but I won't claim I have to, it's a choice I'm making because I'm lazy.

wongarsu · 2026-04-07T16:30:27 1775579427

What are the reasonable options besides a Claude Code subscription (or an equivalent from Codex or Copilot)?

I could pay API prices for the same models, but aside from paying much more for the same result that doesn't seem helpful

I could pay a 4-5 figure sum for hardware to run a far inferior open model

I could pay a six figure sum for hardware to run an open model that's only a couple months behind in capability (or a 4-5 figure sum to run the same model at a snail's pace)

I could pay API costs to semi-trustworthy inference provider to run one of those open models

None of those seem like great alternatives. If I want cutting-edge coding performance then a subscription is the most reasonable option

Note that this applies mostly to coding. For many other tasks local models or paid inference on open models is very reasonable. But for coding that last bit of performance matters

prabal97 · 2026-04-07T17:13:22 1775582002

I use my OAI subscription on my Claude Code. I get the benefit of the Claude Code interface with the intelligence of OAI models.

https://prabal.ca/posts/claude-code-chatgpt-subscription/

echelon · 2026-04-07T16:20:03 1775578803

My job title is "provide value".

I'm given a tool that lets me 10x "provide value".

My personal preferences and tastes literally do not matter.

embedding-shape · 2026-04-07T16:24:03 1775579043

As a professional you have a choice in how you produce whatever it is you produce. Sure, you can go for the simplest, most expensive and "easiest" way of doing things, or you can do other things, depending on your perspective and requirements. None of this is set in stone, some people make choices based on personal preferences, and that matters as much to them as your choices matter to you.

skydhash · 2026-04-07T18:58:35 1775588315

> If you're not seeing these same successes, I legitimately think you're using it wrong.

What is “using it right”? You wrote claims, but explain nothing about your process. Anything not reproducible is either luck or lie.

blurbleblurble · 2026-04-07T16:13:52 1775578432

> fact I can't run Opus locally

Yet

britzkopf · 2026-04-07T16:28:53 1775579333

> And before anyone accuses me of being some "vibe coder", I've built five nines active-active money rails that move billions of dollars a day at 50kqps+, amongst lots of other hard hitting platform engineering work. Serious senior engineering for over a decade

You sound like a pro wrestler. I'd like to know what "hard-hitting" engineering work is. Hydraulic hammers?

dmoy · 2026-04-07T16:48:46 1775580526

I mean five nines is legitimately difficult to accomplish for a lot of problem spaces.

It's also like.... difficult to honestly and accurately measure. And account for whether or not you're getting lucky based on your underlying dependencies (servers, etc) not crashing as much as advertised, or if it's actually five nines. Or whether you've run it for a month and gotten <30s of measure downtime and declared victory, vs run it for three years with copious software updates.

I always assume most people claiming five nines are just not measuring it correctly, or have not exercised the full set of things that will go wrong over a long enough period of time (dc failures, network partitions, config errors, bad network switches that drop only UDP traffic on certain ports, erroneous ACL changes, bad software updates, etc etc)

Maybe they did it all correct though, in which case, yea, seems hard hitting to me.

sutib · 2026-04-07T20:46:58 1775594818

5 nines is at best a temporary achievement, given enough time.

surgical_fire · 2026-04-07T17:16:35 1775582195

I read this as satire. I still think it is.

rurp · 2026-04-07T02:58:24 1775530704

This would be great but I'm sure the entrenched players will make it difficult enough to run effective local models that normal users won't touch them.

There are only two OS options for phones and computers for 99+% of people and it will be trivially easy to restrict local models on them.

rurp · 2026-04-06T18:32:28 1775500348

Very interesting project! Can anyone comment on what the buying process is like? Specifically if there are any weird hoops to jump through or if it's a normal account signup and payment process. Is delivery available or do these need to be picked up in person?

player_piano · 2026-04-06T18:39:40 1775500780

It varies by state and authority. The majority require in-person pickup; for properties there is often longer, sometimes multi-stage bidding process (where the auctioneer periodically reviews current bids, and decides whether any are acceptable before moving to the next stage).

lazyasciiart · 2026-04-06T18:34:57 1775500497

The ones I’ve looked at need to be picked up in person, sometimes with very short deadlines.

lazyasciiart · 2026-04-06T19:52:27 1775505147

e.g. from one auction: Removal Responsibilities: The successful bidder is solely responsible for all aspects of removal, including packing, crating, banding, loading, and shipping. The agency will not provide assistance.

Authorized Third-Party Removal: The authorized third-party agent must present a Letter of Authorization from the high bidder (see terms and conditions for details), a copy of the purchaser's receipt, and a valid photo ID at the time of removal.

Special Pickup Requirements: You are required to provide your last and first name along with the specific date and time to Mimi.quach@noaa.gov for pickup. This information is required to grant you access to NOAA Building 33. Building hours are Monday through Friday, 8:00 AM to 3:00 PM.

Loading Assistance: Staff will be available to help load the item onto your vehicle.

Pickup within 15 business days.

mahoneycutt · 2026-04-07T00:06:15 1775520375

There are deals to be found on these auction sites. However, unless you are prepared to visit the site where the auction items are physically stored, it is a bit of a "pig in a poke" situation. In my case, I estimate between 5% and 10% of the items I buy are defective (with no refunds) so bid accordingly.

rurp · 2026-04-03T19:55:58 1775246158

This is what bringing democracy looks like?! The regime is more entrenched than ever and our commander in chief keeps threatening to commit war crimes on a massive scale. If he follows through on what he says he will do and obliterates all the civilian infrastructure in the country it will kill mass numbers of innocent people and turn millions of survivors into impoverished refugees.

As bad as the regime is, and it's very bad, what we're doing is even worse for most Iranians and the odds a democratic government arises from the ashes of our bombing campaign is incredibly unlikely.

HN For You