Hacker News | giwook's comments

Do you mind elaborating on your experience here?

Just curious as I've often heard that Claude was superior for planning/architecture work while ChatGPT was superior for actual implementation and finding bugs.


Claude makes more detailed plans that seem better if you just skim them, but when you actually analyze them, they usually contain a lot of errors.

It compensates for most of them during implementation if you make it use TDD, either via Superpowers et al. or by just telling it to.

GPT 5.4 makes simpler plans (compared to Superpowers, a plugin from the official Claude plugin marketplace, not the plan mode), but it fills in the details better while implementing.

Plan mode in Claude Code has gotten much better in the last few months, but the model cannot compensate for the missing details during implementation.

So my workflow has been:

Have Claude plan with superpowers:brainstorm, review the spec, make updates, give the spec to GPT, usually to witness grave errors found by GPT, the spec gets updated, another manual review, (many iterations later) the final spec is written, write the plan, GPT finds mind-boggling errors, (many iterations later) a Claude agent swarm implements, GPT finds even more errors, I find errors, fix fix fix, manual code review and red tests from me, tests get fixed, (many iterations later) finally something usable with stylistic issues at most (human opinion)!
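That back-and-forth is basically a fixed-point loop: draft, cross-review, fold the findings back in, and repeat until the reviewer comes up empty. Here is a runnable sketch of just the control flow; `plan` and `cross_review` are hypothetical stubs standing in for whatever Claude/GPT tooling you actually use (the stub reviewer simply surfaces fewer findings each round so the loop terminates):

```python
# Hypothetical sketch of the spec/cross-review loop described above.
# plan() and cross_review() are placeholders for real model calls
# (e.g. Claude drafting, a second model reviewing); stubbed here so
# the control flow itself is runnable.

def plan(feedback):
    """Draft or revise a spec, folding in all reviewer feedback so far."""
    return {"spec": "v%d" % (len(feedback) + 1)}

def cross_review(spec, max_findings=3, _state={"rounds": 0}):
    """Second model reviews the spec and returns a list of findings.
    Stubbed to find fewer errors each round until none remain."""
    _state["rounds"] += 1
    return ["finding"] * max(max_findings - _state["rounds"], 0)

def iterate_spec(max_rounds=10):
    feedback = []
    for _ in range(max_rounds):
        draft = plan(feedback)
        findings = cross_review(draft["spec"])
        if not findings:           # reviewer finds nothing: spec is final
            return draft
        feedback.extend(findings)  # fold findings into the next draft
    raise RuntimeError("spec did not converge")

final = iterate_spec()
print(final["spec"])
```

In practice each call is a real model invocation and "reviewer finds nothing" is the exit condition; the `max_rounds` cap is what keeps you from iterating forever on a spec that never settles.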

This happens with the most complex features, the ones that would be a nightmare to implement even for the most experienced programmers, of course. For basic things, most SOTA models can one-shot it anyway.


Interesting. Have you ever had Claude re-review its plan after having it draft the original plan? Or do you give it to GPT right away to review?

Just curious as I'm trying to branch out from using Claude for everything, and I've been following a somewhat similar workflow to yours, except just having Claude review and re-review its plan (sometimes using different roles, e.g. system architect vs SWE vs QA eng) and it will similarly identify issues that it missed originally.

But now I'm curious to try this while weaving in more GPT.


I use both GH Copilot and CC extensively, and Copilot does seem more economical, though I wonder how long that will last, as I imagine GitHub has also been subsidizing LLM usage heavily.

FWIW, it feels like GH Copilot is a cheaper version of OpenRouter, but with trade-offs like being locked into VSCode and the Microsoft ecosystem overall. I already use VSCode, though, so beyond that I don't see much downside to using GH Copilot.


You're not locked into VSCode. There are plugins for other IDEs, and a 'copilot' CLI tool very similar to Claude Code's CLI.

I also wouldn’t say you’re locked into Microsoft’s ecosystem. At work we just have skills that allow for interaction with Bitbucket and other internal tooling. You’re not forced to use GitHub at all.



I'm hopeful, because Microsoft already has a partnership with OpenAI and owns much of it, so they can get OpenAI's models at cost to host on Azure, which they already do, and pass the savings on to the user. This is also why Opus is 3x as expensive in Copilot: Microsoft has to buy API usage from Anthropic directly.

I don’t think it’s API costs. Their Sonnet 4.6 is just 1x premium request which matches the 1x cost of the various GPT Codex models.

Sonnet is the weaker model, though, so it's expected to be cheaper; the right comparison is Opus vs GPT. That Anthropic's weaker model costs the same per request as OpenAI's best model is what I mean by Microsoft flexing their partnership.

You could use something like [OpenCode](https://opencode.ai), which supports integration with Copilot.

> but with trade-offs like being locked into VSCode and the Microsoft ecosystem overall

You can use GH Copilot with most JetBrains IDEs.


Just to clarify, one does not get access to the pro model on the Pro plan?

The $20 Plus plan still exists, and does not give access to the pro model.

The $200 Pro plan still exists, and does give access to the pro model.

What is new is a $100 Pro plan that does give access to the pro model, with lower usage limits than the $200 Pro plan.


This is still worse than Anthropic's, right? Because you get access to their top model even at the $20 price point.

It's not worse; Anthropic simply has no model equivalent to GPT 5.4 Pro (unless you count Mythos). Google does, though: Gemini 3.1 Deep Think.

GPT 5.4 Pro is extremely slow but thorough, so it's not meant for the usual agentic work, but rather for research or for solving hard bugs/math problems when you provide it all the context.


I'm genuinely asking: when you say Gemini 3.1 DT is an equivalent model to GPT 5.4 Pro, is there a specific benchmark/comparison you're referring to, or is this more anecdotal?

And do you mean to say that you don't really use GPT 5.4 Pro unless it's for a hard bug? Curious which models you use for system design/architecture/planning vs execution of a plan/design.

TIA! I'm still trying to figure out an optimal system for leveraging all of the LLMs available to us as I've just been throwing 100% of my work at Claude Code in recent months but would like to branch out.


Pro and DT are equivalent models because:

- internally, both use the same best-of-N architecture

- neither is available in a code harness like Codex, only in the UI (GPT also has an API)

- GPT-5.4 Pro is extremely expensive: $30.00 input vs $180.00 output

- both DT and Pro are really good at solving math problems
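Assuming those prices follow the usual per-million-token convention (the comment doesn't state the unit, so that's an assumption), a single Pro call can be costed out quickly:

```python
# Rough cost estimate for a single GPT-5.4 Pro call, taking the quoted
# $30 input / $180 output prices as per-million-token rates (an
# assumption; the unit isn't stated above).

def query_cost(input_tokens, output_tokens,
               in_price_per_m=30.00, out_price_per_m=180.00):
    return (input_tokens / 1_000_000) * in_price_per_m \
         + (output_tokens / 1_000_000) * out_price_per_m

# e.g. a 50k-token context with a 20k-token reasoning-heavy answer:
cost = query_cost(50_000, 20_000)
print(f"${cost:.2f}")
```

Even a modest 50k-in / 20k-out query lands around five dollars, which is why this class of model gets pitched at hard one-off problems rather than agentic loops.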


So, reading the tea leaves, they're either losing subscribers on the $200 plan, or they're not following the hockey-stick growth path they thought they were... maybe?

Edit: I wonder if compute constraints are actually the impetus here


Nope, it's just that a lot of people (especially those using Codex) asked us for a medium-sized $100 plan. $20 felt too restrictive and $200 felt like a big jump.

Pricing strategy is always a bit of an art, without a perfect optimum for everyone:

- pay-per-token makes every query feel stressful

- a single plan overcharges light users and annoyingly blocks heavy users

- a zillion plans are confusing / annoying to navigate and change

This change mostly just adds a medium-sized plan for people doing medium-sized amounts of work. People were asking for this, and we're happy to deliver.

(I work at OpenAI.)


Did you modify the Plus plan's usage limits recently, or as part of this introduction? Given that Pro plan limits are multiples of Plus (5x/20x), and given reports of reduced Plus usage, clarification would be appreciated.

Transparency on this sort of thing is the best way to address negative company sentiment.


I'm honestly not sure, as I don't work on it. My understanding from afar is:

- There was a 2x promotion in March that ended on April 2, so limits have felt tighter since then

- We sometimes reset rate limits after bugs or milestones or because Tibo feels generous, which can make some days feel different than others (they are typically announced here: https://x.com/thsottiaux)

- Recently Plus was tweaked to have a smaller 5h limit but an increased weekly limit

- Lastly, as part of the new Pro launch, the $100 & $200 Pro tiers are getting a 2x promotion, meaning they are temporarily 10x/40x instead of 5x/20x

I've asked our team to clarify the pricing page. Agree it's not clear.
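The limit arithmetic in that list, with everything expressed as a multiple of the Plus baseline, works out as follows (a sketch using only the numbers quoted in this thread):

```python
# Relative usage limits as multiples of the Plus plan's baseline.
# Tier multipliers and the temporary 2x launch promotion are taken
# from the comment above.

PLUS_BASELINE = 1
TIERS = {"pro_100": 5, "pro_200": 20}  # normal multiples of Plus
PROMO_MULTIPLIER = 2                   # temporary 2x promotion on Pro tiers

effective = {name: mult * PROMO_MULTIPLIER for name, mult in TIERS.items()}
print(effective)  # the temporary 10x / 40x figures mentioned above
```

So while the promotion runs, the $100 tier behaves like 10x Plus and the $200 tier like 40x Plus, reverting to 5x/20x when it ends.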


Thanks for the response. I tried to phrase my postulations as just that; I didn't intend to be accusatory.

You like the job? How’s the day-to-day go? Yanking tickets or more organic?


All good, I interpreted it as postulation and not accusation. :)

I do like the job! Much more organic than yanking tickets, though I'm on the model training side of things, rather than product side. Always a balance between short-term sprints patching bad behaviors for the next model vs long-term investments in infra and science that make future work easier. Sometimes the negative press gets to me a bit (it's a very different feeling than 2022 or 2023), but my goal is just to make the most useful product I can for people. It's been wild how much Codex has already changed my day-to-day work, I'm so curious to see what it looks like in 2030 or 2040.


Plenty of people wanted to spend more than $20 but less than $200 for a plan. It's long overdue IMO.

The Plus plan doesn't get the pro model, which is (AFAICT) the same 5.4 model but it thinks a lot more.

You're trying to make words mean what we all think they mean. Stop foisting your Textualism upon us!

LOL telepathy!

It's actually via quantum entanglement.


And then some vibe code reviewing.

> My prima facie view on Altman has been that he presents as sincere.

That is how pathological liars present.


In what kind of situation would I choose to use the word "presents" in that context without being aware of that fact?

I am also aware that sincere people present that way.

I don't believe there is any rational way to consider the appearance of innocence as evidence of guilt.


There is a ton of evidence out there that points to guilt. No one implied the appearance of innocence was evidence of guilt (as much as I admire the creativity in your interpretation, Mr. Self-Described Altman Apologist).

Quoting selectively the way you did, with the response you provided, made my interpretation reasonable.

What other point could you have been making? You made no reference to any other evidence.

> as much as I admire the creativity in your interpretation, Mr. Self-Described Altman Apologist

I am unsure if this is deliberate irony, or poor comprehension.


Use the multitude of search tools within your grasp. It is difficult to avoid the evidence.

It may be more of a mental block than anything else.


There may be a reason why Altman gets talked about so much. This article in particular surfaces real information and new perspectives we've not heard at this level of detail before, on some pretty significant topics that will impact you, me, and pretty much everyone we know, not only today but well into the future.

You have a point in that Anthropic deserves some coverage too and that there are interesting perspectives that we've not heard of on that front either.

But just because that's true doesn't mean this article isn't very much relevant and needed.

Because it is.


The New Yorker has given Anthropic plenty of coverage in issues earlier this year.

Any plans to tackle any of the other folks who might be mentioned in the same sentence as Altman, like Dario Amodei?

[flagged]


I think the comment was out of legitimate interest rather than weighing one against the other


Huh? It's a genuine question. The article is great and the writer did a fantastic job.

Please try to give people the benefit of the doubt though I know it's hard in today's society.


tl;dr

No, he cannot.


Pretty sure it's still gone, and you should be using effort level for this now.

No, ultrathink is back, and it's the same thing as high effort for the message in which it's included.

Right, but wasn't high effort the default before? So ultrathink is gone in all but name.
