I've seen Claude try many times to execute a command that it's not supposed to run: the harness blocks it, and then it writes and executes a Python script to do the same thing.
How much more? Depends on the system, doesn't it? I don't know how many systems have /proc mounted, but don't you get it from /proc/self/root?
Anyway that's beside the point, which is that it doesn't have to "be malicious" to try to overcome what look like errors on its way to accomplishing the task you asked it to do.
Until it gets prompt injected. Are you reading every single file your agent reads as part of the tasks you give it, including content fetched from the web or third-party packages?
What would you title this article to make it less "clickbait"? This is one of the least clickbait headlines I've seen; it literally just describes what's in the article.
> Realistically, what are the odds that our not very large or clever brains really do have the potential to understand the entire universe
My belief on this is not entirely rational, of course, but it seems to me that there's probably a sort of Turing-completeness for intelligence/understanding, where as soon as a mind starts being able to understand abstraction, given enough time and resources, it can probably understand the entire universe.
It would also be presumptuous to say that brainfuck is as powerful as every other programming language that exists, and yet we know it to be true. The fundamental reason we can prove that Turing-complete languages are equivalent to each other is that we can build the same abstractions in each of them, so intuitively it feels like a similar principle holds for human intelligence.
I'm not seeing anything in that graph that implies that o1 ever fails on "what is 6*1?" The chart is graphing the number of digits on each axis; it fails on "what is (some 6-digit number) * (some 1-digit number)".
Your initial translation into JavaScript is a representation of the statement "All my things are green hats", which is not the same as "All my hats are green."
The statement "All my hats are green" would map to
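Something along these lines, as a minimal sketch rather than the exact code from the parent comment. The `things` array and its `type` and `color` fields are hypothetical stand-ins chosen for illustration; the point is only the difference between checking every item and checking only the hats.

```javascript
// Hypothetical data: the `type` and `color` fields are assumptions for illustration.
const things = [
  { type: "hat", color: "green" },
  { type: "scarf", color: "red" },
];

// "All my things are green hats": every item must be a hat AND green.
const allThingsAreGreenHats = (items) =>
  items.every((t) => t.type === "hat" && t.color === "green");

// "All my hats are green": only hats are checked; other items are ignored.
const allHatsAreGreen = (items) =>
  items.filter((t) => t.type === "hat").every((t) => t.color === "green");

console.log(allThingsAreGreenHats(things)); // false (the scarf is neither a hat nor green)
console.log(allHatsAreGreen(things));       // true (the only hat is green)
console.log(allHatsAreGreen([]));           // true (vacuously: there are no hats to check)
```

Note that the second predicate is true even when there are no hats at all, because `every` on an empty array returns `true`.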