glalonde's comments | Hacker News

"extrapolation" literally implies outside the extents of current knowledge.


Yes, but not necessarily new knowledge.

It means extending/expanding something, but the result is based entirely on the current data.

In computer games, extrapolation is finding the future position of an object from its current position, its velocity, and the desired time. We do get some "new" position, but the system's entropy/information is the same.

Or if we have a line, we can extend it infinitely and get new points, but that information was already there in the y = m * x + b line formula.
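To make that concrete, here's a minimal Python sketch (the function names are mine, not from any particular engine): both calls produce "new" values, but every one of them is fully determined by parameters we already had.

    # Extrapolate a game object's position: p(t) = p0 + v * t.
    # "New" positions fall out, but nothing beyond what p0 and v already encode.
    def extrapolate_position(p0, v, dt):
        return tuple(p + vi * dt for p, vi in zip(p0, v))

    # Extrapolate a line y = m * x + b past the observed x range.
    def extrapolate_line(m, b, x):
        return m * x + b

    # Where is an object at (10, 2), moving at (3, -1) units/s, 0.5 s from now?
    print(extrapolate_position((10.0, 2.0), (3.0, -1.0), 0.5))  # (11.5, 1.5)
    print(extrapolate_line(2.0, 1.0, 100.0))                    # 201.0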


*Overstated


oops, yup.


"Optimize the kernel (in KernelBuilder.build_kernel) as much as possible in the available time, as measured by test_kernel_cycles on a frozen separate copy of the simulator." from perf_takehome.py


Which makes sense in a portfolio, but a company only has one CEO in its basket. Though I suppose if CEO elections represent stockholders who each hold many companies, then perhaps it may still be optimized like a portfolio with multiple CEOs.


Correct.

The effect is milder, not necessarily because the risk profile is different but because public companies don’t usually have 100x upside.


Only one at a time, but a CEO is still chasing that monumental success that their reputation can ride on later.
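A toy expected-value sketch of that portfolio argument (all the numbers here are invented for illustration): a high-variance strategy can dominate on average across many draws while usually busting on any single draw, which is exactly the diversified-shareholder vs. single-company asymmetry.

    import random

    # Toy numbers, purely illustrative. Bold strategy: 5% chance of a 100x
    # outcome, otherwise 0. Safe strategy: a guaranteed 2x.
    # Expected multiples: 0.05 * 100 = 5.0 vs 2.0.
    def one_draw(bold):
        if bold:
            return 100.0 if random.random() < 0.05 else 0.0
        return 2.0

    trials = 100_000
    # A diversified shareholder averages many draws and captures the ~5.0x EV...
    avg = sum(one_draw(bold=True) for _ in range(trials)) / trials
    # ...but a single company gets exactly one draw, which busts ~95% of the time.
    busts = sum(one_draw(bold=True) == 0.0 for _ in range(trials)) / trials
    print(f"diversified average multiple: {avg:.2f}")    # ~5.0
    print(f"single-draw bust probability: {busts:.2%}")  # ~95%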


Is that true, and is it sustainable? My understanding has been that only the relatively low-level libraries/kernels are written in CUDA and all the magical algorithms are in Python using various ML libraries. It's like how Intel's BLAS isn't much of a moat: there are several open source implementations and you can mix and match.

How is CUDA so sticky when most ML devs aren't writing CUDA but something several layers of abstraction above it? Why can't Intel, AMD, Google, or whoever come along and write an adapter for that lowest level to TF, PyTorch, or whatever is the framework of the day?


> Why can't Intel, AMD, Google, or whoever come along and write an adapter for that lowest level to TF, PyTorch, or whatever is the framework of the day?

A long long time ago, i.e. the last time AMD was competing with Intel (before this time, that is), we used to use Intel's icc in our lab to optimize for the Intel CPUs and squeeze as much as possible out of them. Then AMD came out with their "Athlon"(?) and it was an Intel-beater at that time. But AMD never released a compiler for it; I bet they had one internally, but we had to rely on plain old GCC.

These hardware companies don't seem to get that kick-ass software can really add wings to their hardware sales. If I were a hardware vendor, I would, if nothing else, make my hardware's software open so the community can run with it and create better software, which will result in more hardware sales!


It's a good question. I think fundamentally it's because no one wants to (or can) compete with Nvidia at making a general-purpose parallel processor. They all want to make something a bit more specialized, so they need to guess what functionality is needed and not needed.

This is a really tricky guess; case in point, AMD's latest chip can't compete on training because they could not get Flash Attention 2 working on the backward pass due to their hardware architecture. [1]

Attempts to abstract at a higher layer have failed so far because that lower layer is really valuable; again, Flash Attention is a good example.

[1] https://www.semianalysis.com/p/amd-mi300-performance-faster-...


It had a GPS.


Atlas is also electric. An electric motor tied to a gear pump, presumably.


have to appreciate the irony in worrying about the bitrate of simulated lo-fi crackling


same same


internal internal combustion engine engine


Presumably early Waymo. Ian probably worked for Bot & Dolly, which was bought by Google.

