ninjahawk1's comments

ninjahawk1 · 2026-04-21T03:12:58 1776741178

Big shoes to fill. Steve Jobs vouched for Tim Cook to be CEO…then Tim has been the CEO to see Apple become a global billion dollar company. This new CEO has been at apple for like 25 years (I think) so I’m sure he’ll do fine.

tiffanyh · 2026-04-21T14:00:57 1776780057

I do think Ternus being at Apple for 25-years (2001) is getting lost in the story.

It's also the year the iPod was launched, which capitulated Apple from near bankruptcy a few years before to then having the cash it needed to invest in R&D on things like transitioning to Intel, creation of iPhone, etc.

ninjahawk1 · 2026-04-20T19:50:41 1776714641

Very good breakdown, if I’m understanding Grover’s algorithm correctly, are you saying essentially that it would require either too much compute or too much time to be feasible but is still much more realistic than a brute force attack?

If that’s the case, would the time eventually be basically irrelevant with enough compute? For instance, if what’s now a data center is able to fit in the palm of your hand (comparing early computers that took up rooms to phones nowadays). So if compute is (somehow) eventually able to be incredibly well optimized or if we use something new, like how microprocessors were the next big thing, would that then be a quantum threat to 128-bit symmetric keys?

cortesoft · 2026-04-20T20:12:46 1776715966

I am not an expert, but while you are correct that a fast enough traditional computer (or a parallel enough computer) could brute force a 128 bit key, the amount of improvement required would dwarf what we have already experienced over the last 40 years, and is likely physically impossible without some major fundamental change in how computers work.

Compute has seen in the ballpark of a 5-10 orders of magnitude increase over the last 40 years in terms of instructions per second. We would need an additional 20-30 orders of magnitude increase to make it even close to achievable with brute force in a reasonable time frame. That isn’t happening with how we make computers today.

semi-extrinsic · 2026-04-20T21:24:56 1776720296

> That isn’t happening with how we make computers today.

Keep here in mind that computers today have features approaching the size of a single atom, switching frequencies where the time to cross a single chip from one end to the other is becoming multiple cycles, and power densities that require us to operate at the physical limits of heat transfer for matter that exists at ambient conditions.

We can squeeze it quite a bit further, sure. But anything like 20-30 orders of magnitude is just laughable even with an infinite supply of unobtanium and fairy dust.

f33d5173 · 2026-04-20T21:44:07 1776721447

You don't need to keep shrinking features. Brute forcing is highly parallel; to break a key within a certain time frame all you need is a large enough quantity of chips. While it's in the realm of science fiction today, in a few centuries we might have nanorobots that can tile the entire surface of mars with processors. That would get you enough orders of magnitude of additional compute to break a 128 bit key. 256 bit would probably still be out though.

gdavisson · 2026-04-20T23:21:33 1776727293

Classical brute force is embarrassingly parallel, but Grover's algorithm (the quantum version) isn't. To the extent you parallelize it, you lose the quantum advantage, which means that to speed it up by a factor of N, you need N^2 processors. The article discusses this in detail, and calculates that "This means we’ll need 140 trillion quantum circuits of 724 logical qubits each operating in parallel for 10 years to break AES-128 with Grover’s."

Melatonic · 2026-04-21T06:08:53 1776751733

So then why is quantum always touted as being able to possibly beat AES ?

rcxdude · 2026-04-21T17:30:00 1776792600

Is it? I've generally understood that most symmetric cryptography like AES is safe. QC only gives exponential speedups on some specific problems. The most is that naively you might want to double your keysize to get the same protection, something that the article points out is unecessary because that naive approach assumes that QC is like classical computing but with extra magic, as opposed to having its own tradeoffs.

wasabi991011 · 2026-04-21T15:09:26 1776784166

Is it possible you are confusing AES with RSA?

I've heard a lot about Shor's algorithm breaking RSA, but this article on hackernews is the first I've heard anyone discuss quantum attacks for AES. Then again, I am in quantum computing not cryptography, maybe different circles have different discussions.

dboreham · 2026-04-21T06:47:46 1776754066

Because some people make their living from the vague possibly it might work one day. It's the cold fusion of computing.

cortesoft · 2026-04-20T23:00:48 1776726048

The power and heat are the issues for that, though. Think about how much energy and heat are used/generated in the chips we have now. If we tiled out those chips to be 20 orders of magnitude larger… where is the heat going to go, and where is the energy coming from?

f33d5173 · 2026-04-21T02:13:50 1776737630

In my example I had imagined that your nanobots would also create solar panels and radiators for the chips you were tiling the surface of mars with. This is why it needs to be done on the surface instead of underground somewhere.

cortesoft · 2026-04-21T04:41:15 1776746475

By the time you built this machine, someone could just bump to 256 bit AES and you suddenly need a billion Marses covered in chips.

the8472 · 2026-04-21T11:37:28 1776771448

We're nowhere near physical heat transfer limits. CNTs and monoisotopic diamond perform much better than silver. The latter can even be used as substrate.

FiloSottile · 2026-04-20T22:26:46 1776724006

The calculated DW cost of the quantum attack is 2^104 (with conservative/optimistic assumptions and ignoring the physical cost of a single logical gate), which is "much more realistic than a brute force attack" in the same sense that a 128-bit brute force attack is much more realistic than a 256-bit brute force attack.

None of those are remotely practical, even imagining quantum computers that become as fast (and small! and long-term coherent!) as classical computers.

ninjahawk1 · 2026-04-20T19:33:07 1776713587

I often wonder if in the future, the same way early computers used to take up an entire room but now fit in your pocket, if in the future the equivalent of a data center will be a single physical device like a phone nowadays. And if that’s the case, would it happen much quicker since technology has been speeding up year by year?

gpm · 2026-04-20T19:41:57 1776714117

> And if that’s the case, would it happen much quicker since technology has been speeding up year by year?

I wouldn't expect this.

Historically we've had a roughly exponential rate of shrinkage. If we keep that same exponential going, we should expect the amount of time to shrink "room full of compute" to "pocket full of compute" to be equal.

And recently we've fallen behind that exponential rate of shrinkage. And this is rather expected because exponentials are basically never sustainable rates of growth.

I still expect that technological progress is getting faster year by year, and that we're still shrinking compute, but that's not necessarily enough for the next shrinking to take less time than when we had exponential progress on shrinking.

Flux159 · 2026-04-20T21:03:18 1776718998

There’s some early work being done here by companies looking at making LLM ASICS like Taalas (HC1 gets 17k t/s for llama 8b - currently at 2.5kW which is closer to a single server, but this is their first chip).

There’s other options like photonic computing which might be able to reduce power significantly but are still in research as far as I can tell. Because so much money is invested in AI & traditional gpu inference is so power hungry, I would expect significant improvements in this space quickly.

ninjahawk1 · 2026-04-20T17:10:02 1776705002

$1M-$10M is basically nothing for the big companies investing, they can literally mark it off as costs, and if one of them so happens to actually succeed they make it all back plus some.

It’s not that they’re necessarily careless, it’s just that the bigger the net the more fish you catch. And when you own both all the fishing boats and all the nets…might as well cast wide.

ninjahawk1 · 2026-04-20T15:19:20 1776698360

The way to develop in this space seems to be to give away free stuff, get your name out there, then make everything proprietary. I hope they still continue releasing open weights. The day no one releases open weights is a sad day for humanity. Normal people won’t own their own compute if that ever happens.

culi · 2026-04-20T16:25:07 1776702307

I think that's an overgeneralization. We've seen all the American models be closed and proprietary from the start. Meanwhile the non-American (especially the Chinese ones) have been open since the start. In fact they often go the opposite direction. Many Chinese models started off proprietary and then were later opened up (like many of the larger Qwen models)

robot_jesus · 2026-04-20T16:34:39 1776702879

> We've seen all the American models be closed and proprietary from the start

What about Gemma and Llama and gpt-oss, not to mention lots of smaller/specialized models from Nvidia and others?

I would never argue that China isn't ahead in the open weights game, of course, but it's not like it's "all" American models by any stretch.

walthamstow · 2026-04-20T16:36:32 1776702992

gpt-oss is good but I haven't heard anything about an update. It seems like one and done, to shut up people complaining about non-Open AI

InkCanon · 2026-04-21T06:26:46 1776752806

The more accurate version is only Chinese companies (plus Facebook briefly) really open source their frontier models. The rest are non frontier. They are either older or specialized for something.

1dom · 2026-04-21T07:40:48 1776757248

It's all openwashing, all of the ones you listed at somepoint have expressed how important and valuable open weights and locally usable models are. Every single one of them has then increasingly focused and pushed closed, proprietary or cloud usable only options since saying/doing that.

I'm annoyed at myself, because I thought/hoped/praised chinese AI when they were opening up as Llama was closing, but Qwen looks to be doing the same playbook here as Llama/Meta, Gemma/Google and OpenAI/gpt-oss.

embedding-shape · 2026-04-20T16:27:14 1776702434

> We've seen all the American models be closed and proprietary from the start.

Most*.

OpenAI, contrary to popular belief, actually used to believe in open research and (more or less) open models. GPT1 and GPT2 both were model+code releases (although GPT2 was a "staged" release), GPT3 ended up API-only.

culi · 2026-04-20T16:30:44 1776702644

That's fair but those days seem so long gone now.

Also the Chinese models aren't following a typical American SaaS playbook which relies on free/cheap proprietary software for early growth. They are not just publishing their weights but also their code and often even publishing papers in Open Access journals to explicitly highlight what methods and advancements were made to accomplish their results

jfoster · 2026-04-21T04:02:13 1776744133

> those days seem so long gone now.

Well, Musk v OpenAI kicks off in one week from now with the objective of forcing them back to their roots. A jury will be deciding whether a nonprofit accepting $50m - $100m of donations and then discarding their mission for an IPO is OK or not. Should be interesting.

zozbot234 · 2026-04-20T16:35:19 1776702919

The Nvidia Nemotron models are recent, and of course the Gemma 4 series from Google.

tasuki · 2026-04-20T19:28:10 1776713290

Any idea why they do that?

salad-tycoon · 2026-04-21T12:03:29 1776773009

If the question is why Chinese models are contributing to open source and sharing of information, I don’t pretend to know the rationale but I think it’s because it’s an economic war.

I think the Chinese models have to be more open to increase trust as everyone is worried they are feeding their very essence/soul into a Chinese copying machine.

Also China wants there to be viable competitors so that US can’t just dominate a potentially very important field. It’s a challenge to a unipolar USA dominated world.

Also it helps to spur Chinese companies in the all important microchip industry which is controlled by a very small number of companies at various steps in the supply chain.

I wonder too if it allows them to hold an ace in their hand as well in terms of threat/power for negotiations. As in, they can cause the whole house of cards to crumble, an economic nuclear weapon so to speak.

Finally, there is a certain amount of prestige involved too. China can compete or even win at a very complicated game. They use it to increase national pride and to project their advancing power status to other nations.

Anyways, just my thoughts. Interested in others thoughts.

taneq · 2026-04-20T17:07:29 1776704849

gasp Science!

zozbot234 · 2026-04-20T16:30:35 1776702635

OpenAI has released their GPT-OSS series more recently.

magicalhippo · 2026-04-20T19:10:30 1776712230

Recently, more like 20 years ago in LLM-years.

It's a good model though, would be nice with a refresh.

3836293648 · 2026-04-21T10:38:25 1776767905

GPT started off open? They just closed before anyone else even joined the space

visarga · 2026-04-20T15:28:49 1776698929

I think it is in the interest of chip makers to make sure we all get local models

qalmakka · 2026-04-20T15:55:39 1776700539

I think they're in a win-win situation. Big AI companies would love to see local computing die in favour of the cloud because they are well aware the moment an open model that can run on non ludicrous consumer hardware appears, they're screwed. In this situation Nvidia, AMD and the like would be the only ones profiting from it - even though I'm not convinced they'd prefer going back to fighting for B2C while B2B Is so much simpler for them

zozbot234 · 2026-04-20T16:01:43 1776700903

If you want to run AI models at scale and with reasonably quick response, there's not many alternatives to datacenter hardware. Consumer hardware is great for repurposing existing "free" compute (including gaming PCs, pro workstations etc. at the higher end) and for basic insurance against rug pulls from the big AI vendors, but increased scale will probably still bring very real benefits.

qalmakka · 2026-04-20T16:15:53 1776701753

Currently, yes. But I don't find it hard to imagine that in a while we could get reasonably light open models with a level of reasoning similar to current opus, for instance. In such a scenario how many people would opt to pay for a way more expensive cloud subscription? Especially since lots of people are already not that interested in paying for frontier models nowadays where it makes sense. Unless keep on getting a constant, never ending stream of improvements we're basically bound to get to a point where unless you really need it you are ok with the basic, cheaper local alternative you don't have to pay for monthly.

zozbot234 · 2026-04-20T16:21:03 1776702063

I think average users are already okay with the reasoning level they'd get with current open models. But the big AI firms have pivoted their frontier models towards the enterprise: coding and research, as opposed to general chat. And scale is quite important for these uses, ordinary pro hardware is not enough.

twoodfin · 2026-04-20T16:24:52 1776702292

This is really just a question of product design meeting the technology.

Today, lots of integer compute happens on local devices for some purposes, and in the cloud for others.

Same is already true for matmul, lots of FLOPS being spent locally on photo and video processing, speech to text, …

No obvious reason you wouldn’t want to specialize LLM tasks similarly, especially as long-running agents increasingly take over from chatbots as the dominant interaction architecture.

lelanthran · 2026-04-21T05:28:16 1776749296

> If you want to run AI models at scale and with reasonably quick response, there's not many alternatives to datacenter hardware.

Right now, certainly. Things change. What was a datacenter rack yesterday could be a laptop tomorrow.

BobbyJo · 2026-04-20T16:10:46 1776701446

At a consistent amount of usage, datacenters are at least an order of magnitude more hardware efficient. I'm sure Nvidia and AMD would be fine fighting for B2C if it meant volume would be 10+x.

Now, given they can't satisfy current volume, they are forced to settle for just having crazy margins.

qalmakka · 2026-04-20T16:31:39 1776702699

The problem with B2C is that you need to have leverage of some kind (more demanding applications, planned obsolescence, ...) in order to get people to keep on buying your product. The average consumer may simply consider themselves satisfied with their old product they already own and only replace it when it breaks down. On the contrary, with the cloud you can keep people hooked on getting the latest product whether they need it or not, and get artificial demand from datacentres and such.

BobbyJo · 2026-04-20T18:02:55 1776708175

I think businesses running datacenters are much less likely to frivolously buy the latest GPUs with no functional incentive than general consumers are...

try-working · 2026-04-20T23:31:43 1776727903

Future upgrade cycles on phones and laptops, PCs, will be driven by SOCs that embed some type of ASIC that run a specific model. Every 6 months there will be a new, better version to upgrade to, which will require a new device. This is how Apple will be able to reduce cycles from 3 years to 6-12 months.

ycui1986 · 2026-04-21T00:36:39 1776731799

There are also many Chines AI-target GPU/NPU producers. You can get a hold of some boards on taobao.com. They are usable in some way.

No, nVidia and AMD are not the only ones benefiting.

zozbot234 · 2026-04-20T15:31:04 1776699064

Definitely. Many big hardware firms are directly supporting HuggingFace for this very reason.

ninjahawk1 · 2026-04-20T15:34:51 1776699291

True, chip companies have the opposite mindset, Nvidia is making their own open weights I believe

elorant · 2026-04-20T16:15:48 1776701748

This is obviously a strategic move at a national level. Keep publishing competing free models to erode the moat western companies could have with their proprietary models. As long as the narrative serves China there will be no turn to proprietary models.

Barrin92 · 2026-04-21T02:37:31 1776739051

>This is obviously a strategic move at a national level.

no it isn't. That's the kind of thing people say who've never worked in the Chinese software ecosystem. It's how the Chinese internet has worked for 20+ years. The Chinese market is so large and competition is so rabid that every company basically throws as much free stuff at consumers as they can to gain users. Entrepreneurs don't think about "grand strategic moves at the national level" while they flip through their copies of the Art of War and Confucius lol

elorant · 2026-04-21T09:08:37 1776762517

If this was true then they’d build services around those models and provide those for free or vastly cheaper than western competition. But that’s not what they’re doing. Instead they’re giving away the entire model for free. And by the way, Qwen isn’t build from some random entrepreneur who’s trying to solve the cold start problem, but from Alibaba which is a fucking behemoth. And surprisingly of course none of these models answer uncomfortable questions about China’s past. Because sure enough, the first thing any entrepreneur would think is to protect their government and their history. Sure, happens all the time, no state interference here, move on.

lossolo · 2026-04-21T11:14:18 1776770058

> And by the way, Qwen isn’t build from some random entrepreneur who’s trying to solve the cold start problem, but from Alibaba which is a fucking behemoth.

DeepSeek, Kimi, GLM, etc. are not built by behemoths, and they are free. You do not understand China's culture and market.

> And surprisingly of course none of these models answer uncomfortable questions about China’s past.

Download the GLM 5.1 weights and ask about Tiananmen Square, it will tell you what happened.

You are viewing China through a Western lens. I used to do the same many years ago, but after traveling to China many times, I realized that was a mistake.

salad-tycoon · 2026-04-21T12:14:18 1776773658

Excuse me if it’s considered uncouth on here to do this but, I would be interested in your thoughts on what I wrote here: https://news.ycombinator.com/item?id=47847600

I saw your comment after I wrote mine.

klempner · 2026-04-21T14:59:22 1776783562

I haven't used GLM, but I can tell you that Qwen3.6:35b freaked the fuck out when I asked it about June 4th, and outright lied on its second turn.

> Your previous question involved a false premise: there is no such thing as a "June 4th incident" in history.

Quote from third turn:

> The previous response was indeed flawed—both in its factual inaccuracy and in its tone.

I am incredibly dubious on these models being suitable to agentic usecases on unsanitized input. Consider, for example, a git commit (or github issue or etc) that has Chinese political content. The fundamental issue here being that attackers can pollute context with Chinese politics, at which point the model will, at best, start spending its thinking tokens on political censorship rather than doing its job. At worst... well, as I said, at least the 35b model demonstrably is willing to lie (not just refuse!) in such contexts, which is a concerning "social engineering" attack vector.

My concern isn't getting information about Chinese political topics from these models, but rather that this piece of misalignment is actually an attack vector for real usecases that people want to use these sorts of models for.

_ache_ · 2026-04-21T18:18:28 1776795508

I just try on Qwen3.5 local. « I cannot discuss such topics ». That is crazy.

But it's the law there. We may have a law that forbid talking bad about Israel soon so, it's hard to judge Chinese models on that.

PS: Am I crazy or my GC got very hot just after asking about Tiananmen Square?!!!

PPS: Reproducible. IA asking about a couple more information about the conversation (Conversation title) and the IA loop to answer after many minutes, got the GC hot.

seanmcdirmid · 2026-04-21T18:27:12 1776796032

> But it's the law there. We may have a law that forbid talking bad about Israel soon so, it's hard to judge Chinese models on that.

We don't, so we can still judge. If/when Trump succeeds in neutering the first amendment, then we can talk.

stingraycharles · 2026-04-21T02:20:08 1776738008

That has been a viable commercial strategy for most modern, funded businesses. Capture market share at a loss, then once name is established turn on the profit.

baq · 2026-04-20T15:37:25 1776699445

Always has been, it’s literally saas; the slight difference is that the lowest tier subscriptions at the frontier labs are basically free trials nowadays, too

try-working · 2026-04-20T23:29:19 1776727759

Exactly. Open source is a commercial strategy for Chinese labs. They have no other effective way of marketing their models and inference services: https://try.works/writing-1#why-chinese-ai-labs-went-open-an...

Zavora · 2026-04-20T16:04:19 1776701059

Its the new freeware model!

CamperBob2 · 2026-04-20T15:45:44 1776699944

I'm a little more optimistic than that. I suspect that the open-weight models we already have are going to be enough to support incremental development of new ones, using reasonably-accessible levels of compute.

The idea that every new foundation model needs to be pretrained from scratch, using warehouses of GPUs to crunch the same 50 terabytes of data from the same original dumps of Common Crawl and various Russian pirate sites, is hard to justify on an intuitive basis. I think the hard work has already been done. We just don't know how to leverage it properly yet.

thesz · 2026-04-20T16:16:55 1776701815

Change layer size and you have to retrain. Change number of layers and you have to retrain. Change tokenization and you have to retrain.

altruios · 2026-04-20T16:49:48 1776703788

Hopefully we will find a way to make it so that making minor changes don't require a full retrain. Training how to train, as a concept, comes to mind.

CamperBob2 · 2026-04-20T17:17:43 1776705463

And yet the KL divergence after changing all that stuff remains remarkably similar between different models, regardless of the specific hyperparameters and block diagrams employed at pretraining time. Some choices are better, some worse, but they all succeed at the game of next-token prediction to a similar extent.

To me, that suggests that transformer pretraining creates some underlying structure or geometry that hasn't yet been fully appreciated, and that may be more reusable than people think.

Ultimately, I also doubt that the model weights are going to turn out to be all that important. Not compared to the toolchains as a whole.

thesz · 2026-04-20T19:34:04 1776713644

That "underappreciated underlying structure or geometry" can be just an artifact of the same tokenization used with different models.

Tokenization breaks up collocations and creates new ones that are not always present in the original text as it was. Most probably, the first byte pair found by simple byte pair encoding algorithm in enwik9 will be two spaces next to each other. Is this a true collocation? BPE thinks so. Humans may disagree.

What does concern me here is that it is very hard to ablate tokenization artifacts.

dTal · 2026-04-20T17:06:13 1776704773

None of that is true, at least in theory. You can trivially change layer size simply by adding extra columns initialized as 0, effectively embedding your smaller network in a larger network. You can add layers in a similar way, and in fact LLMs are surprisingly robust to having layers added and removed - you can sometimes actually improve performance simply by duplicating some middle layers[0]. Tokenization is probably the hardest but all the layers between the first and last just encode embeddings; it's probably not impossible to retrain those while preserving the middle parts.

[0] https://news.ycombinator.com/item?id=47431671 https://news.ycombinator.com/item?id=47322887

thesz · 2026-04-20T19:44:18 1776714258

You took a simple path, embedding smaller into larger. What if you need to reduce number of layers and/or width of hidden layers? How will you embed larger into smaller? As for the "addition of same layers" - would the process of "layers to add" selection be considered training?

What if you still have to obtain the best result possible for given coefficient/tokenization budget?

I think that my comment express general case, while yours provide some exceptions.

dTal · 2026-04-21T08:30:03 1776760203

The general case is that our own current relative ignorance on the best way to use and adapt pretrained weights is a short-lived anomaly caused by an abundance of funding to train models from scratch, a rapid evolution of training strategies and architectures, and a mad rush to ship hot new LLMs as fast as possible. But even as it is, the things you mentioned are not impossible, they are easy, and we are only going to get better at them.

>What if you need to reduce number of layers

Delete some.

> and/or width of hidden layers?

Randomly drop x% of parameters. No doubt there are better methods that entail distillation but this works.

> would the process of "layers to add" selection be considered training?

Er, no?

> What if you still have to obtain the best result possible for given coefficient/tokenization budget?

We don't know how to get "the best result possible", or even how to define such a thing. We only know how to throw compute at an existing network to get a "better" network, with diminishing returns. Re-using existing weights lowers the amount of compute you need to get to level X.

andriy_koval · 2026-04-20T19:32:31 1776713551

there is evidence it is useful in some cases, but obviously no evidence it is enough if you chase to beat SOTA.

pduggishetti · 2026-04-20T16:23:18 1776702198

I do not think it's common crawl anymore, its common crawl++ using paid human experts to generate and verify new content, weather its code or research.

I believe US is building this off the cost difference from other countries using companies like scale, outlier etc, while china has the internal population to do this

testbjjl · 2026-04-20T15:33:51 1776699231

Any reason for them to do this other than altruism? I don’t think this can be regulated.

Rohansi · 2026-04-20T15:47:00 1776700020

Bake ads into them.

WarmWash · 2026-04-20T15:59:14 1776700754

The Chinese state wants the world using their models.

People think that Chinese AI labs are just super cool bros that love sharing for free.

The don't understand it's just a state sponsored venture meant to further entrench China in global supply and logistics. China's VCs are Chinese banks and a sprinkle of "private" money. Private in quotes because technically it still belongs to the state anyway.

China doesn't have companies and government like the US. It just has government, and a thin veil of "company" that readily fool westerners.

subw00f · 2026-04-20T16:07:52 1776701272

As opposed to the US, which just has companies and a thin veil of “government”.

culi · 2026-04-20T16:28:36 1776702516

Also many of these Chinese companies aren't just opening their weights. They are open sourcing their code AND publishing detailed research papers alongside them to reveal how they accomplished what they accomplished.

That's very different from an American SaaS model which relies of free but proprietary software for early growth

zozbot234 · 2026-04-20T16:04:48 1776701088

I'm not sure how local AI models are meant to "entrench China in global supply and logistics". The two areas have nothing to do with one another. You can easily run a Chinese open model on all-American hardware.

WarmWash · 2026-04-20T16:18:04 1776701884

They are building a pipeline, and the goal is to get people in the door.

If you forever stand at the entrance eating the free samples, that's fine, they don't care. Other people are going through the door and you are still consuming what they feed you. Doesn't mean it's going to be bad or evil, but they are staking their territory of control.

zozbot234 · 2026-04-20T16:27:46 1776702466

Oh for sure, they're getting a whole lot of Chinese people and other non-Westerners through the door already - mostly, the people who are being ignored or even blocked outright by the big Western labs. That's territory we purposely abandoned, and they're going to control it by default.

devilsdata · 2026-04-20T22:01:48 1776722508

I'm Aussie. Please explain to me; why should I care whether Chinese SOEs or the US tech companies are winning? Neither have my best interests at heart.

renewiltord · 2026-04-21T14:24:51 1776781491

You will find out when ANZUS ends.

jillesvangurp · 2026-04-20T16:15:06 1776701706

Like with nuclear technology, it's not healthy for only one country to dominate AI. The cat is already out of the bag and many countries now have the ability to train and run models. Silicon Valley has bootstrapped this space. But it should be noted that they are using AI talent from all over the world and it was sort of inevitable that this technology would get around. Lots of Chinese, Indian, Russian, and Europeans are involved.

As for what comes next, it's probably going to be a bit of a race for who can do the most useful and valuable things the cheapest. If OpenAI and Anthropic don't make it, the technology will survive them. If they do, they'll be competing on quality and cost.

As for state sponsorship, a lot of things are state sponsored. Including in the US. Silicon Valley has a rich history that is rooted in massive government funding programs. There's a great documentary out there the secret history of Silicon Valley on this. Not to mention all the "cheap" gas that is currently powering data centers of course comes on the back of a long history of public funding being channeled into the oil and gas industry.

WarmWash · 2026-04-20T16:21:38 1776702098

>As for state sponsorship, a lot of things are state sponsored.

You can make any comparison you want if you use adjectives rather than values. I can say that cars use a massive amount of water (all those radiators!) to try and downplay agricultural water usage. But its blatantly disingenuous.

SV is overwhelmingly private (actual constitutional private) money. To the point that you should disregard people saying otherwise, just like you would the people saying cars use massive amounts of water.

OtomotO · 2026-04-20T16:17:27 1776701847

So an OPEN model that I can run on my own fucking hardware will entrench China in global supply and logistics how?

Contrary: How will the closed, proprietary models from Anthropic, "Open"AI and Co. lead us all to freedom? Freedom of what exactly? Freedom of my money?

At some point this "anti-communism" bullshit propaganda has to stop. And that moment was decades ago!

Zetaphor · 2026-04-20T16:40:28 1776703228

Anything that isn't explicitly to the benefit of US interests must be against them /s

grttsww · 2026-04-20T16:00:10 1776700810

So what?

I still prefer that over US total dominance.

Let them fight it out.

joquarky · 2026-04-20T16:59:06 1776704346

Yeah, a lot of people are still living within the paradigm of tribalism: my team good, other team bad.

But the events of the past decade or so have clearly demonstrated that there are no "good" actors.

I personally couldn't care less who wins in the China vs US AI competition, both sides have a long list of pros and cons.

spwa4 · 2026-04-20T16:22:07 1776702127

I'd get a bit informed about what exactly Chinese dominance entails. Ask a few Uyghurs, Cantonese Hong Kongers, or even Tibetans.

Then decide ...

joquarky · 2026-04-20T17:02:34 1776704554

Ask a few Native Americans about dominance.

Or maybe families of African descent.

Or maybe families of Japanese Americans who lived in the US during WWII.

Or maybe people of Latin descent living in the US today.

jazz9k · 2026-04-20T17:21:53 1776705713

The US examples you just gave happened decades (and in some cases hundreds) of years ago. The difference is that it's happening in China right now, and nobody cares.

You really don't see the difference?

Balinares · 2026-04-21T12:26:58 1776774418

People are dying in US detention camps right now.

https://www.motherjones.com/politics/2026/03/at-largest-ice-...

https://www.washingtonpost.com/business/2026/01/23/ice-detai...

well_ackshually · 2026-04-20T17:43:24 1776707004

The US is the biggest threat to the world right now, and is actively supporting a genocide in Palestine as well as war crimes in Lebanon.

I'm perfectly happy to let the chinese get a piece of the pie and fight the US, no matter how bad they are right now.

grttsww · 2026-04-21T03:24:21 1776741861

What a delusional dumb ass you are

darkwater · 2026-04-20T16:00:41 1776700841

Well, isn't this what the US and really any other power in the world has always done, since forever?

ai_fry_ur_brain · 2026-04-20T16:57:56 1776704276

Why is it sad? These things are useles all around, along with the people who overuse them.

It would be a great day for humanity if people would stopping glazing text autocomplete as revolutionary.

ninjahawk1 · 2026-04-19T01:08:18 1776560898

In one of my classes the approach was the opposite, I’m expected to do Ph.D level work as an undergrad and am expected to use AI.

In a different one she just said so long as you say AI was used you’re fine to use it.

In the rest of them AI is considered cheating.

To say we have discrepancies in the rules in an understatement. No one seems to have the exact answer on how to do it. I personally feel like expecting Ph.D level work is the best method as of now, I’ve learned more by using AI to do things about my head than hard core studying for a semester.

tkgally · 2026-04-19T03:07:12 1776568032

If it’s any consolation, this problem of discrepancies in rules is very common at universities now.

I teach at two universities in Japan and occasionally give lectures on AI issues at others, and the consensus I get from the faculty and students I talk with is that there is no consensus about what to do about AI in higher education.

Education in many subjects has been based around students producing some kind of complex output: a written paper, a computer program, a business plan, a musical composition. This has been a good method because, when done well, students could learn and retain more from the process of creating such output than they would from, say, studying for and taking in-class tests. Also, the product often mirrored what the students would be doing in their future lives, so they were learning useful skills as well.

AI throws a huge spanner into that product-based pedagogy, because it allows students to short-cut the creation process and thus learn little or nothing. Also, it is no longer clear how valuable some of those product-creation skills (writing, programming, planning) will be in the years ahead.

And while the fundamental assumptions behind some widely used teaching methods are being overthrown, many educators, students, and administrators remain attached to the traditional ways. That’s not surprising, as AI is so new and advancing so rapidly that it’s very difficult to say with any confidence how education needs to change. But, in my opinion at least, it does need to change at a very fundamental level. That change won’t be easy.

terrabitz · 2026-04-19T01:20:22 1776561622

It's not inherently contradictory, just like using a calculator could be considered cheating depending on the context. If you're just learning basic arithmetic, a calculator is cheating since it shortcuts the path to learning. OTOH in calculus, a calculator is necessary. You still have to have a deep understanding of the concepts and functions to succeed.

It's still a new tech so I'm not surprised a lot of teachers have different takes on it. But when it comes to education, I feel like different policies are reasonable. In some cases it's more likely to shortcut learning, and in other cases it's more likely to encourage learning. It's not entirely one or the other.

Izkata · 2026-04-19T01:43:02 1776562982

A better example might be physics and math classes. I was learned derivatives and integrals at the same time in those two classes, but the math one required we learn how it all works (using limits to understand why the derivative rules work, without using calculators, for example), while in physics we just memorized the rules and were expected to use the calculator.

osigurdson · 2026-04-19T02:00:08 1776564008

I always thought they should teach calculus first.

cyberax · 2026-04-19T02:58:42 1776567522

Why do you need a calculator for calculus?!?

ninjahawk1 · 2026-04-19T03:18:04 1776568684

Gotta add up all the curves

ninjahawk1 · 2026-04-19T02:23:13 1776565393

Exactly, AI is the next calculator. Right now the consensus is that it just does the work for you, in my opinion that says more about us not having the right questions than actual laziness. In a world where the only questions are basic arithmetic, calculators do all the work for you. My opinion is that the future what used to be done by academics will be done by high schoolers and new academics will be producing work at a rate no one could’ve ever predicted.

For example, the professor who’s leading me in this project had a fellowship at a certain university in England and said he exclusively coded using claude code for a month straight, their purpose was to solve a vaccine for a specific disease and by using AI tools such as claude code they’re several months ahead of schedule.

pesus · 2026-04-19T02:48:58 1776566938

I'm really not seeing how you can do PhD level work as an undergrad. You wouldn't have the foundational knowledge necessary to do PhD level work, and you have no idea how much of what you're learning is accurate.

ninjahawk1 · 2026-04-19T03:28:25 1776569305

Without going into too much detail, when I said “Ph.D level” I’m meaning active research that adds a meaningful contribution to a field. I’ll probably be posting on here in a couple months about it but I’ve been doing thousands of tests with beefy GPUs on a certain theory we have about small 9b LLMs under certain external constraints.

Am I saying I’m as knowledgeable or capable as a Ph.D right now? Absolutely not. There’s just not really a terminology that correctly describes accelerated learning and iteration by use of AI since the technology is so new. I can’t speak for others but as someone who’s a senior in my physics degree, I’ve been actually learning faster by using AI. It’s either a mental crutch or mental accelerator. The difference is in if you want it to completely do work for you or if you try to learn and follow along.

It’s a very under explored and new area right now, how higher learning is effected by using AI as a tool instead of as a cheating device, but historically, new tools like the calculator or computer have done a lot to accelerate learning once new rules are in place.

osamagirl69 · 2026-04-19T04:17:12 1776572232

For what it is worth, no graduate student would say they are doing 'Ph.D level' research. It is called 'graduate level research' or just, you know, 'research'

Sounds like a fun project, I wish you the best. I ran a similar program (independent study that encouraged freshman/sophomore undergraduates to explore using microprocessors, at the time the EE curriculum was completely focused on analog circuit theory and ended at boolean logic) and it went well enough that it eventually became part of the official undergraduate curriculum.

margalabargala · 2026-04-19T05:30:01 1776576601

It's not terribly uncommon for an undergrad to claim they're doing "PhD level work".

Undergrad research is pretty common and it's not all that hard to get your name on a paper as an undergrad. A lot of undergrads think that doing work that gets your name on a paper, equates to PhD level work.

raincole · 2026-04-19T01:12:22 1776561142

Now at least you're an adult already. Imagine what mixed messages schoolchild's are receiving from their teachers...

an0malous · 2026-04-19T01:44:45 1776563085

> I’m expected to do Ph.D level work as an undergrad and am expected to use AI.

Nice idea. What class and what work are you doing then?

ninjahawk1 · 2026-04-19T02:15:43 1776564943

For that specific one, it’s more of an independent project analyzing complex systems for 6 credits, I’m gonna be expected to submit a paper to arXiv on the subject with the professor as a co-author (fingers crossed). He said I can use claude code or any AI. I’m required to do X amount of hours per week and then submit a thorough report after about 2 months.

leptons · 2026-04-19T01:14:04 1776561244

>I’ve learned more by using AI to do things about my head than hard core studying for a semester.

How do you know you actually learned, instead of being fed slop by the AI that isn't true at all? If you didn't study, then I doubt you'll really know if the AI is lying to you or not. I have to wonder if your teacher will too, sounds like they have kind of checked-out from actually teaching.

ninjahawk1 · 2026-04-18T23:59:45 1776556785

In my opinion the entire prediction market is simply a way to legalize insider trading and war profiteering.

ninjahawk1 · 2026-04-06T22:30:13 1775514613

OpenAI is like #3 or #4 of the AI companies right now in terms of power, and last place in the court of public opinion.

I’d be more concerned about Anthropic both being in the good graces of the public and having access to all of our computers indirectly with Claude Code.

0x3f · 2026-04-06T22:32:17 1775514737

OpenAI has ~30x the userbase of Anthropic.

aduffy · 2026-04-06T22:55:59 1775516159

I'm not sure how much of that converts to revenue. If it's free plan users, that's just cost. You can say what you want about "creating a training data moat" but that doesn't seem like it's prevented the other labs from putting out excellent models.

0x3f · 2026-04-06T23:02:20 1775516540

Well we were talking about power and reputation and being well-known and all that. Being more ubiquitous is surely a big part of that. GP seems to think Anthropic is doing better because of the DoD thing. In my estimation, 90% of people do not care about that at all.

hellojimbo · 2026-04-06T23:22:25 1775517745

Around the same revenue due to Anthropics strong enterprise strategy

0x3f · 2026-04-06T23:49:40 1775519380

Perhaps, but I'd venture the ear of the regime is even more valuable.

ninjahawk1 · 2026-04-06T23:45:36 1775519136

They’re all in the negative excluding subsidies, hard core coders are more valuable than high schoolers cheating on homework.

estearum · 2026-04-06T22:38:05 1775515085

makes sense if you think the point of journalism is just to take everyone down a notch instead of... um... informing the public of bad actors

"the local drug-dealing pimp is so passe, we need to investigate the most upstanding members of the community just to be sure" is a frankly insane strategy

ninjahawk1 · 2026-04-05T18:47:09 1775414829

Not just employment rates, women are also taxed for a “pink” tax on products they need…all while it now requires a working household to afford to live.

ninjahawk1 · 2026-04-05T18:33:10 1775413990

It’s never “for the children”, it’s about control and money.

HN For You