Hacker News | brianush1's comments

claude is stupid but not malicious; chroot is sufficient

I've many times seen Claude try to execute a command that it's not supposed to, the harness prevents it, and then it writes and executes a python script to do it.

breaking a chroot takes more than that..

How much more? Depends on the system, doesn't it? I don't know how many systems have proc mounted, but don't you get it from /proc/self/root?

Anyway that's beside the point, which is that it doesn't have to "be malicious" to try to overcome what look like errors on its way to accomplishing the task you asked it to do.


That doesn't mean Claude can't do it; chroot is better than nothing, but it's not a real solution.

Malice is not required. If it thinks it is in the right, then it will do whatever it takes to get around limitations.

Sure, it's not malicious. But it is very eager to get things done, and surprisingly inventive and knowledgeable in all kinds of workarounds.

Until it gets prompt injected. Are you reading every single file your agent reads as part of the tasks you give it, including content fetched from the web or third-party packages?

Claude is far from stupid in my experience. I've used so many models and Claude is king.

One could make the argument that higher-level thought is not the same as awareness of higher-level thought; perhaps language only affords the latter.


What would you title this article to make it less "clickbait"? This is one of the least clickbait headlines I've seen; it's literally just describing what's in the article.


> Realistically, what are the odds that our not very large or clever brains really do have the potential to understand the entire universe

My belief on this is not entirely rational, of course, but it seems to me that there's probably a sort of Turing-completeness for intelligence/understanding, where as soon as a mind starts being able to understand abstraction, given enough time and resources, it can probably understand the entire universe.

It would also be presumptuous to say that brainfuck is equally powerful to every other programming language that exists, and yet we know it to be true. The fundamental reason we can prove that Turing-complete languages are equivalent to each other is that we can build the same abstractions in both, so intuitively it feels like a similar principle holds for human intelligence.


apparently the pegs only have to touch the holes, they don't have to line up perfectly


ahhhhhhh thanks


I'm not seeing anything in that graph that implies that o1 ever fails on "what is 6*1?" The chart is graphing the number of digits on each axis; it fails on "what is (some 6 digit number) * (some 1 digit number)"



Your initial translation into JavaScript is a representation of the statement "All my things are green hats", which is not the same as "All my hats are green."

The statement "All my hats are green" would map to

    things.every(thing => thing.type !== 'hat' || thing.color === 'green')
i.e., everything the person owns must either be green or, if it isn't green, not be a hat.

The negated form would then be

    things.some(thing => thing.type === 'hat' && thing.color !== 'green')
i.e., there are some hats that are not green.
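
To make the difference concrete, here's a rough sketch you can paste into Node. The `things` array and the function names are made up for illustration, and `allThingsGreenHats` is only my reading of the "All my things are green hats" version, not the original code.

```javascript
// Hypothetical inventory, invented for this example.
const things = [
  { type: 'hat', color: 'green' },
  { type: 'scarf', color: 'red' },
];

// "All my hats are green": each thing is either not a hat, or it is green.
const allHatsGreen = (items) =>
  items.every((thing) => thing.type !== 'hat' || thing.color === 'green');

// Negation: at least one thing is a hat that is not green.
const someHatNotGreen = (items) =>
  items.some((thing) => thing.type === 'hat' && thing.color !== 'green');

// "All my things are green hats" (assumed reading): a stronger claim,
// because every single thing must be a hat AND be green.
const allThingsGreenHats = (items) =>
  items.every((thing) => thing.type === 'hat' && thing.color === 'green');

console.log(allHatsGreen(things));       // true: the red scarf is not a hat
console.log(allThingsGreenHats(things)); // false: the scarf is not a green hat
console.log(allHatsGreen([]));           // true: vacuously, with no hats at all

// Adding a blue hat flips both the statement and its negation.
const withBlueHat = [...things, { type: 'hat', color: 'blue' }];
console.log(allHatsGreen(withBlueHat));    // false
console.log(someHatNotGreen(withBlueHat)); // true
```

For any list, `someHatNotGreen(items)` is exactly `!allHatsGreen(items)`, which is just De Morgan's law applied to `every`/`some`.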


pretty sure you have a typo, should be "If not, consider (sqrt(2) ^ sqrt(2)) ^ sqrt(2)."
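
Assuming this is the usual "irrational to an irrational power can be rational" argument (that's my guess at the context), the reason the corrected expression is the one you want is that it collapses to 2:

```latex
\left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}}
  = \sqrt{2}^{\,\sqrt{2}\cdot\sqrt{2}}
  = \sqrt{2}^{\,2}
  = 2
```

So if sqrt(2) ^ sqrt(2) is irrational, raising it to sqrt(2) gives a rational result, which covers the "if not" case.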


it works with textbooks too though


Isn't doing the exercises a lot more efficient?

