Hacker News | pu_pe's comments

This is partly why this talk about AI "solving science" should be taken with a grain of salt. Here the authors intentionally poisoned the publication record, but there are millions of papers out there that are also garbage, and it would be very hard for either a human or an LLM to distinguish them from actual work.

I agree with the general insight here. Python is great for humans but once they are out of the loop it's no longer as useful. Having a compiler is more useful for LLMs indeed.

However, we are moving one step closer to humans being completely unable to understand the code, since there are likely 100x more developers with experience in Python than in Rust. If humans are indeed going to be the bottleneck then perhaps this is inevitable, and languages designed specifically for LLMs will dominate.


I actually believe we need to rethink Git for modern needs. Saving prompts and sessions alongside commits could become the norm for example, or I could imagine having different flags for whether a contribution was created by a human or not.

This doesn't seem to be the direction these guys are going though, it looks like they think Git should be more social or something.


Idk how git works under the hood, but both of those seem like they could be easily accomplished with git itself.

But if not, then just in your own workflow: have a dir dedicated to storing prompt history, with each file titled after the commit id.

As for the flag, just agree on some convention and toss it in the commit message.
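A minimal sketch of that convention, assuming a commit-message trailer named `AI-Generated` (the key name is made up for illustration, not any existing standard):

```shell
# Hypothetical convention: flag AI contributions with a commit trailer.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"

echo "hello" > app.txt
git add app.txt
# The second -m becomes a separate final paragraph, which git parses
# as a trailer block (see git-interpret-trailers).
git commit -q -m "Add app.txt" -m "AI-Generated: yes"

# Read the flag back out of the commit message.
git log -1 --format='%(trailers:key=AI-Generated,valueonly)'
```

Since trailers are queryable via `git log --format='%(trailers:...)'`, tooling could filter AI-flagged commits without any extension to git itself.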


Actually, it is. We're currently leading a conversation among several players in this space to agree on a metadata standard that helps make attaching, collaborating on and transmitting information like this simple, extensible and scalable.

Keep an eye on our blog to see how we're doing this, and hopefully in a way the entire community joins us on, so we're not all reinventing the same wheels.


What do people expect to do with these saved prompts/contexts? Nobody is going to read through them, right? I suppose the thinking is LLMs will, but any decently active codebase will soon contain far too much context for any current LLM. Is this the same thinking behind cryonics, i.e. we may be able to use this stuff one day, so let's start saving it now? Hoarding has ruined many people and it will ruin us all if we're not careful...

For me the reason would be to preserve traces of intentionality (i.e. what was the user trying to achieve with this commit?). These days a 10k LOC commit might be triggered by a 100-word user prompt; there is a lot more signal in reading the prompt itself than the code changes.

I mean, it's just text, so it shouldn't be too taxing to store it. I agree it's hoarder mentality though :)


remove the existing code, add the feature to the prompt and re-generate everything, probably

>Saving prompts and sessions alongside commits could become the norm for example, or I could imagine having different flags for whether a contribution was created by a human or not.

Yes, it could have syntax like

    git notes add -m "Claude prompt: foo fee faa foo" <commit-hash>
and then the tooling could attach any metadata to it that is desired.

OH WAIT YOU CAN DO THAT ALREADY SINCE 2009

Seriously, 90% of the complaints about git not being able to do something are either RTFM or "well, it can, but could use some better porcelain to present to the user".
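For completeness, here is the full round trip of the notes approach (the prompt text is just the example from above; the push step is commented out since it needs a remote):

```shell
# Attach a prompt to a commit with git notes and read it back.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "dev@example.com"
git config user.name "Dev"
echo "x" > f.txt
git add f.txt
git commit -q -m "Initial commit"

# Attach arbitrary metadata to the commit without rewriting it.
git notes add -m "Claude prompt: foo fee faa foo" HEAD
git notes show HEAD
# Notes live in their own ref and are not pushed by default:
# git push origin refs/notes/commits
```

The "better porcelain" point stands, though: sharing notes requires explicitly pushing and fetching `refs/notes/*`, which most hosting UIs don't surface.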


> I could imagine having different flags for whether a contribution was created by a human or not.

Only useful if it can be reliably verified, which is challenging at best.

The point of git is that it has strong authentication built into the fabric of the thing.
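To illustrate why verification is the hard part: ordinary commit metadata is entirely self-reported, so an unsigned human/AI flag proves nothing. (The authentication git does provide comes from cryptographic signing, e.g. `git commit -S` and `git verify-commit`, which attests to the signer's key, not to how the code was produced.) A quick sketch:

```shell
# Anyone can claim any author identity on an unsigned commit.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "totally-a-human@example.com"
git config user.name "Definitely A Human"
echo "x" > f.txt
git add f.txt
git commit -q -m "Hand-written code, honest"
# git happily records whatever identity we claimed:
git log -1 --format='%an <%ae>'
```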


Paper-thin, AI generated article that doesn't address its actual premise. We have no evidence that model collapse is happening at all.


It's crazy to me that this is not considered fraud. You sign up for a yearly plan under a given assumption of functionality, then they just change the terms to give you less than what they agreed to without compensating you in any way. That's textbook fraud.


The English word for it is similar too (Mythomaniac)


I think most English speakers who recognize it will think of it as meaning a collection of myths, as in the "Cthulhu Mythos".


I appreciated the writeup and your clarification.

I wonder whether this was your first attempt to solve this issue with LLMs, and this was the time you finally felt they were good enough for the job. Did you try doing this switch earlier on, for example last year when Claude Code was released?


Honestly, I was very averse to agentic coding up until Opus came out. The hallucinations and false confidence it had in objectively wrong answers just broke more things than it fixed.

However, after it came out, it suddenly behaved close to what they marketed it as. So this was my first real end-to-end project relying on AI in the front seat. Design-wise it is nowhere near perfect, though, and I was holding its hand the entire way.


I really like the tool and how you designed the UI, well done! Very interesting use case and a slick interface.


Thanks!


Fascinating stuff. Any chance of using a sparse autoencoder or some other method to try to grasp what the model is actually doing in those middle layers? It would be quite cool to get a better sense of what type of input it is getting the first time it goes through the reasoning circuit compared to the second or third time.


> Like with any LLM project, the first 90% of the work was super smooth and barely needed my intervention. The last 10% was a slog.

The author doesn't really describe which part was a slog, I thought autoresearch was supposed to be pretty much set and forget.

