Being 10x (or whatever multiplier) faster at programming doesn't mean you're going to be 10x faster in designing a product or any other aspect that goes into making a good product.
Even if you hired an actual programmer, it'd take a massive amount of time to build a Photoshop clone.
Of course, at the end Photoshop is lines of code and it could be output as is, end to end. One problem is that users aren't generally giving very precise design documents which would narrow the way to interpret them into code in precisely one way. Or that a design document at any level of precision, other than code, couldn't be interpreted in multiple ways when it comes to a specific implementation.
LLMs also take a relatively long time to output acceptable code, often taking tens of minutes before giving you a small diff. The larger the codebase, the longer it usually takes to start producing code, even over an hour.
> Of course, at the end Photoshop is lines of code and it could be output as is, end to end
And yet it’s not.
> LLMs also take a relatively long time to output acceptable code, often taking tens of minutes before giving you a small diff. The larger the codebase, the longer it usually takes to start producing code, even over an hour.
The problem is that they don’t generate acceptable code, they generate code that needs to be edited to be acceptable. That has always been the slow part of engineering. Waiting an hour for a bugfix even if it cost $75 in tokens would be cheaper than hiring an engineer but only if it worked. And it’s a bit like hiring a snake oil salesman - it passes the sniff test but it’s only when you’re drowning in the fact that your ai now takes 4 hours to fix the same bugs, and it introduces new bugs _and_ you don’t have anyone who can reduce that complexity that you see the reality. For a lot of us, that is immediately clear from first glance at the output of Claude and codex and the likes.
That was my point as well. That it hasn't been output, even though it could be done by a talented solo developer given enough time, and that current LLMs definitely aren't able to do so.
> The problem is that they don’t generate acceptable code, they generate code that needs to be edited to be acceptable.
You've never had an LLM output a one line bugfix that is correct to the point where you don't have to edit it?
To make things more concrete, here's an example from the creator of Redis on how he utilizes LLMs in programming: https://antirez.com/news/164
> You've never had an LLM output a one line bugfix that is correct to the point where you don't have to edit it?
I have. I’ve also had IDEs and static analysers do the same thing. I can also take my car out of gear and have it roll down a hill but that doesn’t mean it can run without fuel. Only a sith deals in absolutes, and in the general cases LLMs don’t generate acceptable code.
The part about positional encoding is not correct.
> The intuition: instead of adding position info to each token’s vector, RoPE rotates the vector by an angle that depends on its position
You can't rotate the token's entire vector (or all three vectors, whatever is being implied is unclear). You rotate each token's Query and Key vectors only, so dot product can be used to tell how far apart the tokens are when comparing token 1's Query vector to token 2's Key vector.
Positional embedding should just be explained after explaining the Query, Key and Value vectors. When the article explains those only after that, the reader is building up on a wrong intuition and it gets confusing.
If a project has a rule to not submit AI generated PRs, people should never submit AI generated PRs to that project. It's spam. Or if the rule is more nuanced than that in relation to AI, it should be respected.
It's 100% an issue with the people with submitting these PRs. So, if someone has a history of having no issue with breaking project rules (let alone being arrogant about it), it should be a massive red flag about the person for any possible employer or future collaborator checking their profile, etc.
Why people are wilfully destroying their own reputation like that is beyond me. It's infinitely better to have no activity at all on your profile than to create a track record of bad behaviour.
I don't really know if this would be the best solution, but we could maybe define a sliding LoC limit per PR? Also a limit on the number of open PRs per contributor. This way those small, reviewable contributions would get a fair shot at being landed instead of being blanket banned, and the maintainer overhead would be more reasonable.
Even that small one-line contribution is not the contributor’s own work. Since the contributor does not hold the copyright to it, they do not have the right to contribute it to the project.
AI data centers are being already used at max capacity, aren't they? I have a hard time imagining people would suddenly use AI less than they do as of today, let alone collectively drop it altogether. So the worst case scenario is that they'd need to be auctioned off way under what they'd be worth now, but still for someone to use them for AI.
Inference is much cheaper than training a new model, so running them just for inference is a completely different thing than having to price in the fact that at the moment all of these companies need to compromise between compute for inference and compute for training new models. If no new models were to be trained, and all the compute was inference only, that would change everything when it comes to the overall compute cost of AI.
Dotcom infra buildup is a bad comparison, in that it wasn't even close to being all utilized. The infra was completely overproportional to the day to day usage.
AI data centers that exist and are operational are running at maximum capacity. That's why you see things like the tiny little data center run by xai showing up as a valuable resource to xai (on the sale side) and anthropic (buy side). It is "only" 300 megawatts and there's a 1.25 billion rent on it per month.
If all these other data centers were anywhere near coming on line, that 300mw data center would be a rounding error not a line item as it is right now.
So someone's signed contracts for way more and way larger data centers, someone's purchased billions in hardware for these not yet operational data centers. I'm wondering how depreciation's going to work on all these assets...
Anyhow, I'm not really sure what "max capacity" is here, nor am I really aware when they're going to be delivering the operational assets that are currently levered to their eyeballs and consuming 1/3rd of the memory made on the planet.
As far as inference vs training, have new gotten radically better than old models or only marginally (at the cost of 10x or more the training costs)?
I would day that the dotcom was directionally correct but the timing was wrong. For instance you had pets.com in 1999 but in 2020 you had chewy.com. It's like you had broadcast.com in 2000 but by 2020 you had YouTube that was making more in ad revenue than the next 4 largest competitors.
I imagine the trend for AI usage will go up over the very long term (5-10yrs etc.), but short term how much usage is being propped up by employer's forcing their employees to use it? Or by user's being curious about the novelty but ultimately abandoning it if it doesn't do what they want? It'll be interesting to see what changes as tokenmaxxing disappears.
Something that reads like an LLM wrote it is different from an LLM having written it to begin with. Something written by an LLM can be something that doesn't have the hallmarks of LLM all over it.
I was just saying that the original quote doesn't strike me as something that's an annoyingly good piece of LLM writing.
There's a lot of experimentation happening on how to get LLMs to write well, starting with half of what's been posted on Gwern's blog as of late.
I am not a lawyer but I’m pretty sure you can’t just slap an MIT or whatever else license on public code with an intentional trojan hidden in it and expect to not be held accountable for the damages caused by the trojan running.
If the damage resulted from an unexpected problem like a bug, then you’re probably fine. But this phrase was intentionally placed by the author and intended to inflict at least a little damage (destroy code) onto specific users.
Whether some words are legally equivalent to an actual virus, I couldn’t say.
I remember someone posting about the most human of all traits being reassuring to see: Typos. I'm pretty sure people are not as averse to leaving or finding typos in text as they were 5 years ago, as these days it's a signal of humanity.
Same has been applying to art for a while. Several artists who have an "AI-ish" style have been wrongfully crucified for using AI. And been forced to post videos of their process end to end, in order to prove that they aren't using AI. It's a thing for artists to post their new stuff with: "AI could never do this."
Slightly off topic, I'm not sure, but way back in the day (2000ish) a friend of mine used PERL scripts to scrape all the big databases which existed at the time, namely IMDB.
He used PERL for scraping and the same for generating "new" content with what he had scraped. He made a bunch of static websites with ads. The sites connected to each other. Sports, Celebrities, Movies, you name it. He had a formula.
Those sites were profitable enough that he could travel the world and have fun, basically. His mother collected the cheques from Google and cashed them into his account.
Now to get to the final point here, his secret was simply TYPO INJECTION to avoid Google's then embryotic duplicate content detection.
MCP is a way to define tools that works with many apps and has a lot of extra functionality built in, it's not the only way, but it's popular because many apps support it. You can also make tools using the opencode API or any other API, and you can give them large descriptions that take up a lot of context. No matter how you define the tools, they are injected into the context of the model using the same chat template provided by the developers of the model.
Even if you hired an actual programmer, it'd take a massive amount of time to build a Photoshop clone.
Of course, at the end Photoshop is lines of code and it could be output as is, end to end. One problem is that users aren't generally giving very precise design documents which would narrow the way to interpret them into code in precisely one way. Or that a design document at any level of precision, other than code, couldn't be interpreted in multiple ways when it comes to a specific implementation.
LLMs also take a relatively long time to output acceptable code, often taking tens of minutes before giving you a small diff. The larger the codebase, the longer it usually takes to start producing code, even over an hour.
reply