The more efficient tokenizer reduces usage by representing the same text with fewer tokens. But the lack of transparency does mean Anthropic could still scale down limits to account for that.
Given the alleged recent extreme reduction in Claude Code usage limits (https://news.ycombinator.com/item?id=47739260), how do these more autonomous tools work within that constraint? Are they effectively only usable with a 20x Max plan?
EDIT: This comment is apparently [dead] and idk why.
I've been talking to friends about this extensively, and have read all sorts of social media posts on X where people did deep dives (I'm at work so I don't have any links handy, though I did submit one on HN; take it with a grain of salt, I'm unsure how valid it is, but it was interesting: https://news.ycombinator.com/item?id=47752049 ).
I think the real issue stems from the 1 million token context window change. They did not anticipate the amount of load it would generate. In those first few days after they released the new context window, I was making amazing things in one single session, going from nothing to something (a new .NET-based programming language inspired by Python, and a Virtual Actor framework in Rust). I think since then they've been trying too many tweaks, whilst irritating their users.
They even added a new "Max" thinking mode, and made "High" the old medium, which is ridiculous because you think you're using "High" but really you're not. There's a hidden config file to change their terrible defaults to let Claude be smarter still, and apparently you can toggle off the 1M tokens.
I think the real fix, and I'm surprised nobody there has done this yet, is to let the user trim down their context window.
Think about it, you used to have what? 350k tokens or so? Now Claude will keep sending your prompt from 30 minutes ago that's completely irrelevant to the back-end, whereas 3 months ago it would have been compacted by now.
Others have noted that similar prompting for some ungodly reason adds tens of thousands of extra garbage tokens (not sure why).
Edit: looks like someone figured out that if you downgrade your version of Claude Code and change one single setting, it unruins Claude:
Here is the question for which I cannot find an answer, and cannot yet afford to answer myself:
In Claude Code, I use Opus 4.6 1M, but stay under 250k via careful session management to avoid known NoLiMa [0] / context rot [1] crap. The question I keep wanting answered though: at ~165k tokens used, does Opus 1M actually deliver higher quality than Opus 200k?
NoLiMa would indicate that with a ~165k request, Opus 200k would suck, and Opus 1M would be better (as a lower percentage of the context window was used)... but they are the same model. However, there are practical inference deployment differences that could change the whole paradigm, right? I am so confused.
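To make the "lower percentage of the context window" point concrete, here is just the arithmetic as a minimal sketch. It assumes nominal window sizes of 200,000 and 1,000,000 tokens, and the function name is made up for illustration. It says nothing about whether quality actually tracks this ratio, which is the open question.

```python
def context_utilization(tokens_used: int, window_size: int) -> float:
    """Return the fraction of the context window occupied by a request."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    return tokens_used / window_size


# Assumed nominal window sizes; the ~165k request size is from the comment above.
request_tokens = 165_000
windows = {"Opus 200k": 200_000, "Opus 1M": 1_000_000}

for name, size in windows.items():
    share = context_utilization(request_tokens, size)
    print(f"{name}: {share:.1%} of window used")
```

So the same request fills most of the smaller window and about a sixth of the larger one, but if both names point at the same weights, that ratio alone may not mean anything.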
Anthropic says it's the same model [2]. But, Claude Code's own source treats them as distinct variants with separate routing [3]. Closest test I found [4] asserts they're identical below 200K but it never actually A/B tests, correct?
Inside Claude Code it's probably not testable, right? According to this issue [5], the CLI is non-deterministic for identical inputs, and agent sessions branch on tool-use. Would need a clean API-level test.
The API level test is what I really want to know for the Claude based features in my own apps. Is there a real benchmark for this?
I have reached the limits of my understanding on this problem. If what I am trying to say makes any sense, any help would be greatly appreciated.
If anyone could help me ask the question better, that would also be appreciated.
Two parent comments above say that you can use an older version of Claude Code with Opus 200k to compare.
my guess is that eventually you’ll be able to set it in model settings yourself
Yeah, I have been seeing lots of comments, tweets, etc., but given everything I have learned about these models, I do not think the change to 1M was innocuous. I'm not sure what they've claimed publicly, but I'm fairly certain they must be doing additional quantization, or at minimum additional quantization of the KV cache. Plus, sequence length can change things even when not fully utilized. I had to manually re-enable the "clear context and continue" feature as well.
I used the heck out of it when it was announced, and it felt like I was using one of the best models I've ever used, but then so were all of their other customers. I don't think they accounted for such heavy load, or maybe follow-up changes goofed something up, not sure. Like I said, for the first few days the 1M token window allowed me to bust out some interesting projects in one session, from nothing to "oh my" in no time.
I'm thinking they should go back to all their old settings and as a user cap you at their old token limit, and ask you if you want to compact at your "soft" limit or burst for a little longer, to finish a task.
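Roughly what I mean, as a minimal sketch and not anything Claude Code actually does. The names and the numbers (a 200k soft limit, a 50k burst allowance) are made up for illustration.

```python
from enum import Enum


class Action(Enum):
    CONTINUE = "continue"
    ASK_USER = "ask_user"            # offer: compact now, or burst to finish the task
    FORCE_COMPACT = "force_compact"  # burst allowance is used up


def next_action(
    tokens_used: int,
    soft_limit: int = 200_000,       # hypothetical "old" cap
    burst_allowance: int = 50_000,   # hypothetical extra room past the soft limit
    burst_approved: bool = False,
) -> Action:
    """Decide what the harness should do before sending the next request."""
    if tokens_used < soft_limit:
        return Action.CONTINUE
    if tokens_used >= soft_limit + burst_allowance:
        return Action.FORCE_COMPACT
    # Between the soft limit and the hard limit: only continue if the user agreed.
    return Action.CONTINUE if burst_approved else Action.ASK_USER
```

The point is that the user, not the harness, chooses when to pay for a longer context.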
The future of harnesses cannot be „resend the whole history on every step“ or whatever this terrible compaction is.
Most of the context is unstructured fluff, and much of it is distracting or even plain wrong. The „thinking“ tokens especially are often completely disjoint hallucinations that don’t make any sense.
I think what will have to happen is that context looks less like a long chat and action log and more like a structured, short, schema validated state description, plus a short log trace that only grows until a checkpoint is reached, which produces a new state.
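Something like this, as a minimal sketch of the idea rather than how any existing harness works. All class and field names are hypothetical, the "schema" is reduced to two checks, and in practice the new state would come from the model (or a summarizer) and would have to pass validation before replacing the old one.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TaskState:
    """Short, structured description of where the task stands."""
    goal: str
    done: tuple[str, ...] = ()
    open_items: tuple[str, ...] = ()

    def validate(self) -> None:
        if not self.goal.strip():
            raise ValueError("goal must be non-empty")
        if set(self.done) & set(self.open_items):
            raise ValueError("an item cannot be both done and open")


@dataclass
class Context:
    state: TaskState
    trace: list[str] = field(default_factory=list)
    max_trace: int = 5  # trace only grows until a checkpoint is due

    def record(self, event: str) -> None:
        self.trace.append(event)

    def needs_checkpoint(self) -> bool:
        return len(self.trace) >= self.max_trace

    def checkpoint(self, new_state: TaskState) -> None:
        """Replace the state with a validated new one and drop the trace."""
        new_state.validate()
        self.state = new_state
        self.trace.clear()

    def render_prompt(self) -> str:
        """What gets sent each step: the state plus the short trace, no chat history."""
        lines = [f"GOAL: {self.state.goal}"]
        lines += [f"DONE: {item}" for item in self.state.done]
        lines += [f"OPEN: {item}" for item in self.state.open_items]
        lines += [f"TRACE: {event}" for event in self.trace]
        return "\n".join(lines)
```

Each request then carries a bounded amount of context instead of an ever-growing transcript.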
It’s going to be crazy to see what explanation they come up with for why the US public has to pay to bail out AI for national security.
In a way, it’s true that if China has superior AI, then its dominance over the US will materialize. But it’s not hard to see how this scenario is being used to essentially lie and scam our way into trillions of debt.
It’s interesting how the cutthroat space of big tech has turned into an insidious hyper-capitalist system whose primary function is disrupting a system. The system in this case is the world order and Western governments.
It is, as I'm running it; it has been added this week. As I said I'm running the main version from Github and doing nothing special, see: https://news.ycombinator.com/item?id=47761308
I have been building/vibecoding a similar tool and unfortunately came to the conclusion that, in practice, so many features depend on the full Chrome stack that it's just more pragmatic to use a real Chromium installation despite the file size. Performance/image generation speed is still fine, though.
I think you could in theory have a similar webkit-based stripped down headless crate that might have a good tradeoff of features, performance, and size.
Because people might have missed it last thread, here's dang's response to the discourse:
> I don't think I've ever seen a thread this bad on Hacker News. The number of commenters justifying violence, or saying they "don't condone violence" and then doing exactly that, is sickening and makes me want to find something else to do with my life—something as far away from this as I can get. I feel ashamed of this community.
> Edit: for anyone wondering (or hoping), no I'm not leaving. That was a momentary expression of dismay.
I recently saw a lecture by neuroscientist Robert Sapolsky [1] which discussed the complexities of human violence. We both condone and don't condone violence all the time, depending on social context. Furthermore, our ways of expressing violence are varied (even down to tiny things like the silent treatment). We (along with other animals) have always used aggression to enforce social order and obtain social benefit.
Perhaps something to think about in a scenario like this. Personally I think it's interesting that some people are so quick to condone aggressive attacks on powerful people, yet have no comment on those powerful people committing lower levels of violence against the masses. It's all social context.
> Perhaps something to think about in a scenario like this. Personally I think it's interesting that some people are so quick to condone aggressive attacks on powerful people, yet have no comment on those powerful people committing lower levels of violence against the masses. It's all social context.
Can I just say that out of all of this discourse, this might be the most insightful yet succinct position to explain my stance on all of this, especially the "It's all social context." line.
I feel like many of us here might not share this answer publicly, but I have always believed that if I were in someone else's shoes, I might act the way they do, so in a sense I understand the human part of it. A human did the violence, and for a reason. I understand that. Now we can call this violence inhuman, sure, but this action was still done by a human, and for many reasons. And I also understand why people condemn these actions: we wish to live in a clean and structured world, and then we see the messiness of the world.
I just feel like condemning an action does nothing unless we change the ground conditions, but that isn't in the hands of many of us Hacker News users, and there is basically a class aspect to it.
I personally feel like there are some similarities between this incident and the trolley problem, actually. Vsauce did a video about it that is worth watching [0].
I believe that the unique community of HN consists mostly of individuals who weren't able to fully understand those elements of human nature in elementary school (and sometimes high school). I stand as one example of such a person; it took me about 30 years before I understood that I lacked such innate understanding at school.
How is a person from a nation that the US President has threatened to annex or invade supposed to feel about seeing domestic violence in the United States? From their perspective a divided United States is less of a personal threat to them.
All this talk about how 'we can't have this in a democracy!' forgets that many of us don't live in that particular democracy, and that particular democracy is threatening other democracies.
What should my response be if a North Korean General is executed? Or if a Russian oligarch 'falls out a window'? Or a corrupt Mexican politician is beheaded by a rival cartel?
These American oligarchs aren't my countrymen. They don't have my best interests in mind, they fund the people who threaten my country, and now they provide the American military with technology that it can use to attack my country.
Their lobbying and campaign contributions have resulted in a Mad King waging an unwinnable war that has severely damaged the global economy and has made my life demonstrably worse. I have never done anything to these people and yet they callously did this to all of us for personal profit well beyond what any human being could ever need in a thousand lifetimes.
At the end of the day the less cohesive the American tribe is the better off my tribe is. I wish our incentives were aligned but they just aren't and I am not in any way responsible for that.
It is fascinating to me that this was the thing that dang thinks is the most violent in the forum's history.
Not people advocating for hundreds of thousands of unnecessary deaths from covid. Not people advocating for bombing campaigns blowing children to smithereens. Not people advocating for mass cuts to programs treating people with tuberculosis. Not people advocating for mass cuts to programs feeding the starving. Not people defending ICE in murdering people either via gunshot or medical neglect in their disgusting prisons.
In fact, a lot of discussion critical of that stuff just gets [flagged].
None of that counts as violence to dang. But threaten a billionaire? Oh that's a bridge too far.
It would be extremely difficult to have a political discussion without condoning violence. Deciding what sorts of violence are OK is an inherent part of politics. In practice, there's no way to ban calls for violence without banning the discussion of wide swaths of political topics.
It is unnecessary, and it was an obvious offense, not defense. Of course it is "bad". We (Trump) need(s) to stop creating wars and fucking up the economy, while killing others. It is bad all the way down.
I agree with the idea that calls for violence are bad; however most people in the world are more than happy to support both violence and calls for same against people and organizations they believe to be sufficiently significant threats.
Are calls for violence against Hitler during WW2 bad? How about the Japanese imperial navy?
How about calls for violence against Putin during his war of aggression?
This isn’t rhetoric; I’m just pointing out that it isn’t as black and white as people seem to make it. (It is black and white for me, as I’m with Asimov on the matter, but it isn’t for most humans.)
If you can't think of a single occurrence in history that directly disproves your proposed guideline, it's time to drop whatever you're doing and study history.
If you can think of one, then you shouldn't be proposing introduction of guidelines that are blatantly false. Or would you like a "1+1 is not 2" guideline to accompany it?
Are calls for violence bad when you're calling for throwing a molotov cocktail at a child? At an adult? At a serial killer? At someone who's about to shoot you unprovoked? At someone who murdered your family? At someone who's about to?
If you said "yes" to all of the above, I'd love to know your reasoning.
Are the two choices "accept that violence is unconditionally bad" and "throw a molotov cocktail at Sam Altman's house"? Because that dichotomy seems a bit... false?
Sure, I care about certain people more than others and I’d be willing to use violence to defend myself or my family. But that’s not the same as cheering on or advocating for an attack on someone else that may or may not have done something to harm someone totally unrelated to you.
I think they have it set so that only followers can respond. Prevents most of the horrible stuff, but also downgrades you on the X algorithm. At least there are no weirdos on the other social media platforms :-)