I think this is more of a mechanistic-understanding vs. fundamental-insight situation. The linear algebra picture is currently very mechanistic, since it only tells us what the computations are. There are research groups trying to go beyond that, but the insight from these efforts is currently very limited. The probabilistic view, however, is much clearer. You can find many explorable insights, both potentially true and false, just by understanding the loss functions, what the model is sampling from, what the marginal or conditional distributions are, and so on. Generative AI models are beautiful at that level. It is truly mind-blowing that in 2025 we are able to sample from megapixel image distributions conditioned on natural-language text prompts.
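To make that last point concrete, here is a minimal sketch of sampling from p(image | text prompt) using the open-source diffusers library (my illustration, not something from the thread; the checkpoint name is just one popular example, and a GPU is assumed):

  # pip install torch diffusers transformers
  import torch
  from diffusers import StableDiffusionPipeline

  # Load a pretrained text-conditional diffusion model.
  pipe = StableDiffusionPipeline.from_pretrained(
      "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
  ).to("cuda")

  # Draw one sample from the conditional image distribution.
  image = pipe("a watercolor lighthouse at dusk").images[0]
  image.save("sample.png")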


If that were true, then people could have predicted this AI many years ago.


If you dig up old ML/vision papers, you will see that, formulation-wise, they actually did; what they lacked was the data, the compute, and the mechanistic machinery provided by the transformer architecture. The wheels of progress turn slowly and require many rotations to finally reach somewhere.


I have deep respect for CUDA and Nvidia engineering. However, the arguments above seem to totally ignore Google's search indexing and query software stack. They are the kings of distributed software, and of hardware that scales. That is why TPUs are a thing now and why Google can compete with Nvidia where AMD failed. Distributed software is Google's bread and butter, a multi-decade investment made from day zero out of necessity. When you have to update an index over an evolving set of billions of documents daily, online, while keeping subsecond query capability across the globe, that teaches you a few things about deep software stacks.


That is insightful. Courage to take risks means higher standard deviation in outcomes: more visible successes, but also more hard failures. Risk-averse cultures have more stable outcomes: no big successes, but also fewer financially crippling failures. A personal or social safety net may or may not make you risk averse. Taking semi-calculated risks seems like a skill that needs to be learned for successful entrepreneurship.


The computations in transformers are actually generalized tensor-tensor contractions implemented as matrix multiplications. Their efficient implementation on GPU hardware involves many algebraic gems and is a work of art. You can get a taste of the complexity involved in their design from this YouTube video: https://www.youtube.com/live/ufa4pmBOBT8
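To illustrate what "contraction implemented as a matrix multiplication" means, here is a toy sketch (shapes of my own choosing, not from the video): the attention-score contraction written as an einsum is numerically identical to the batched matmul that GPU kernels ultimately execute.

  import numpy as np

  # Toy attention shapes: batch=2, heads=4, seq=8, head_dim=16
  Q = np.random.randn(2, 4, 8, 16)
  K = np.random.randn(2, 4, 8, 16)

  # Attention scores as a tensor contraction over the head dimension d.
  scores = np.einsum("bhqd,bhkd->bhqk", Q, K) / np.sqrt(16)

  # The same contraction lowered to a batched matrix multiplication.
  scores_mm = Q @ K.transpose(0, 1, 3, 2) / np.sqrt(16)
  assert np.allclose(scores, scores_mm)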


As a 15+ year Emacs user, the only item on my wishlist is a client-server remote editing mode similar to that of VS Code. Then I could go back to using Emacs on cloud VMs. Does anyone know of a solution that works as well as VS Code even when your latency is high? Hopefully I will get pissed off enough with all the weird configuration flags of VS Code to write one myself ;-) To be fair, its Python integration is quite good, at least for the usual stuff.


Two approaches might work here:

  1) Run Emacs on your local machine and use Tramp to edit the remote files (see the example below)

  2) Run Emacs on the remote machine with the files you're editing. This likely means running in the terminal itself (emacs -nw, or equivalently emacs --no-window-system).
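For option 1, opening a remote file with Tramp is just a specially formed file name; the user, host, and path below are placeholders:

  C-x C-f /ssh:user@host:/path/to/file

Tramp runs the file operations over SSH while the editing itself happens in a local buffer, so typing stays responsive even on a slow link; only opens and saves round-trip to the remote machine.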


In certain contexts 20% is a lot of bucks; leaving that on the table would be very wasteful ;-)


Yes, it would be 20% wasteful. But giving up freedom can be more costly.

Also, the 20% would be open to further optimization by the community, so it wouldn't be that bad in practice, probably.


In some commercial contexts, with the savings from that 20% you can buy a lot of freedom, and then with the freedom you bought you can make more free things :)


Could all be true, maybe, somehow. But I sleep better when my castle is not in someone else's kingdom. That alone is enough for me to accept the small performance penalty.


Google DeepMind is the closest lab to that idea, because Google is the only entity big enough to approach the scale of AT&T. I was skeptical that the DeepMind and Google Brain merger would be successful, but it seems to have worked surprisingly well. They are killing it with LLMs and image editing models. They are also backing the fastest-growing cloud business in the world and collecting Nobel prizes along the way.


Sadly, this naturally happens in any field that expands due to its own success. Suddenly the number of new practitioners outgrows the number of competent educators. I think it is a fundamental human-resources problem with no easy fix. Maybe LLMs will help with this, but they seem to reinforce convergence to the mean in many cases, as those being educated are not in a position to ask the deeper questions.


> Sadly, this naturally happens in any field that expands due to its own success. Suddenly the number of new practitioners outgrows the number of competent educators. I think it is a fundamental human-resources problem with no easy fix.

In my observation, the problem is rather that many of the people who want to "learn" computer science actually just want a certification to get a cushy job at some MAGNA company, and then they complain about the "academic ivory tower" stuff that they learned at university.

So the big problem is not a lack of competent educators, but practitioners actively sabotaging the teaching of topics that they don't consider relevant for a job at a MAGNA company. The same holds for the bigwigs at such companies.

I sometimes even entertain the conspiracy theory that if graduates saw how much of their work at these MAGNA companies draws on computer science that is often decades old and has been reinvented multiple times over, it might demotivate employees who are supposed to believe that they work on the "most important, soon to be world-changing" thing.


I agree, it is another important factor. Pandemic-era pay and hiring rates certainly accentuated this.


And that last 5% is the toughest nut to crack. There is a reason Waymo is way ahead, even if they cannot scale. Cameras are passive devices with relatively poor dynamic range and low-light behavior. They are nowhere near a match for, or a replacement of, the human eye. Just try to photograph a 5-year-old at dusk or indoors: what you see will not be what you get.


Agree that the last few percentage points get exponentially more difficult at each step of the way. But what is your metric for saying Waymo is ahead in terms of tech? They are strictly geofenced, limited to specific road types, and often get stuck or confused. Their system is also very expensive and not scalable to millions of cars. Your point about cameras seems odd: cameras have much better low-light performance than human eyes, and cars have headlights.


Waymo already has a driverless taxi service in a major US city and is expanding. Tesla is still in the process, and again, that is only if they cover the last 5%. Scalability arguments won't matter if they cannot launch such a service at all. And no, CMOS cameras are close but are not better than the human eye in low light, unless you have an IR camera and can flood everywhere with active IR illumination; they are certainly inferior in dynamic range. I have been doing vision for more than two decades, and I would not be comfortable in a camera-only robotaxi at high speed, certainly not at night or under adverse weather conditions. But this is all speculation, of course. Considering that fully autonomous driving at scale has been a major unrealized promise for the past 10 years, I stand by my assessment until I see a major advance in camera technology or affordable active sensors.


Honestly, if you have any actual interest in LLMs or other generative AI variants, just go after a concrete goalpost that you set yourself, with measurable metrics to gauge your progress. Then the predicted timelines from podcasts and blog posts become irrelevant. Experts and non-experts alike have been terrible at predicting timelines since the dawn of AI; self-driving cars and LLMs are no exception. When you make predictions based solely on intuition and experience, it is mostly extrapolation. That is not useless: it always helps to ask questions and try to frame the future within the bounds of our current understanding. But at the same time, it is important to remember that this is speculation, not empirical science. That is also why there are such varied opinions on AI timelines. Relax and enjoy witnessing a major leap in our understanding of natural language, vision, and high-dimensional probabilistic vector spaces ;-)

