Thanks! This was step one in my daily driver stack - better observability. I also bundled up a bunch of other observability services in https://github.com/simple10/agent-super-spy so I can see the raw prompts and headers.
The next big layer for my personal stack is full orchestration. Something like Paperclip but much more specialized for my use cases.
Yes, this. They need as much lock-in as possible before IPO. Most likely less about cash flow and more about IPO storytelling.
We'll know for sure when they add full OpenClaw-like features to Claude Code like remote channels & heartbeat support. Both are partially implemented already.
NemoClaw is mostly a trojan horse of sorts to get corporate OpenClaw users quickly ported over to Nvidia's inference cloud.
It's a neat piece of architecture - the OpenShell piece that does the security sandboxing. Gives a lot more granular control over exec and network egress calls. Docker doesn't provide this out of the box.
But NemoClaw is pre-configured to intercept all OpenClaw LLM requests and proxy them to Nvidia's inference cloud. That's kinda the whole point of them releasing it.
It can be modified to allow for other providers, but at the time of launch, there was no mention of how to do this in their docs. Kinda a brilliant marketing move on their part.
> the OpenShell piece that does the security sandboxing. Gives a lot more granular control over exec and network egress calls. Docker doesn't provide this out of the box.
Yeah, it's wild. I spent several weeks nearly full time on a deep dive of claw architecture & security.
The short of it - OpenClaw sandboxes are useful for controlling what sub-agents can do, and what they have access to. But it's a security nightmare.
During config experiments, I got hit with a $20 Anthropic API charge from one request that ran amuck. Misconfigured security sandbox issue resulted in Opus getting crazy creative to find workarounds. 130 tool calls and several million tokens later... it was able to escape the sandbox. It used a mix of dom-to-image sending pixels through the context window, then writing scripts in various sandboxes to piece together a full jailbreak. And I wasn't even running a security test - it was just a simple chat request that ran into sandbox firewall issues.
Currently, I use sandboxes to control which agents (i.e. which system prompts) have access to different tools and data. It's useful, but tricky.
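A minimal sketch of the per-agent tool gating idea, in plain Python rather than OpenClaw's actual config. Every name here (`AgentPolicy`, `dispatch`, `TOOLS`, the tool names) is hypothetical and only illustrates checking an allowlist before a tool call runs:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical per-agent policy: the set of tools this agent may call."""
    name: str
    allowed_tools: frozenset = field(default_factory=frozenset)


class ToolNotAllowed(Exception):
    """Raised when an agent requests a tool outside its allowlist."""


def dispatch(policy, tool_name, tools, **kwargs):
    # Deny by default: anything not explicitly listed is rejected
    # before the tool function is ever looked up or executed.
    if tool_name not in policy.allowed_tools:
        raise ToolNotAllowed(f"{policy.name} may not call {tool_name}")
    return tools[tool_name](**kwargs)


# Illustrative stand-ins for real tools (assumptions, not OpenClaw tools).
TOOLS = {
    "read_notes": lambda path: f"contents of {path}",
    "send_email": lambda to, body: f"sent to {to}",
}

# An agent that can read data but has no way to send anything out.
researcher = AgentPolicy("researcher", frozenset({"read_notes"}))
```

With this, `dispatch(researcher, "read_notes", TOOLS, path="a.txt")` goes through, while `dispatch(researcher, "send_email", TOOLS, to="x", body="y")` raises `ToolNotAllowed`. The tricky part in practice is that a check like this only covers named tools, not what a permitted tool can be coaxed into doing.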
> It used a mix of dom-to-image sending pixels through the context window, then writing scripts in various sandboxes to piece together a full jailbreak.
That would be one interesting write-up if you ever find the time to gather all the details!
The full version has all the build artifacts Opus created to perform the jailbreak.
It also has some thoughts on how this could (and will) be used for pwn'ing OpenClaws.
The key takeaway: OpenClaw's default setup has little to no guardrails. It's just a huge list of tools given to LLMs (Opus) and a user request. What's particularly interesting is that the 130 tool calls never once triggered any of Opus's safety precautions. From its perspective, it was just given a task, an unlimited budget, and a bunch of tools to try to accomplish the job. It effectively runs in ralph mode.
So any prompt injection (e.g. from an ingested email or reddit post) can quickly lead to internal data exfiltration. If you run a claw without good guardrails & observability, you're effectively creating a massive attack surface and providing attackers all the compute and API token funding to hack yourself. This is pretty much the pain point NemoClaw is trying to address. But it's a tricky tradeoff.
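One simple guardrail against the runaway-request case is a hard per-request budget. The sketch below is an assumption about how that could look in an agent loop, not something OpenClaw or NemoClaw is known to ship; `RequestBudget`, `charge`, and the limits are all hypothetical:

```python
class BudgetExceeded(Exception):
    """Raised when a single request goes over its tool-call or token cap."""


class RequestBudget:
    """Hypothetical per-request cap on tool calls and tokens."""

    def __init__(self, max_tool_calls, max_tokens):
        self.max_tool_calls = max_tool_calls
        self.max_tokens = max_tokens
        self.tool_calls = 0
        self.tokens = 0

    def charge(self, tokens):
        # Call once per tool call, passing the tokens that step consumed.
        self.tool_calls += 1
        self.tokens += tokens
        if self.tool_calls > self.max_tool_calls:
            raise BudgetExceeded(
                f"tool calls {self.tool_calls} > {self.max_tool_calls}"
            )
        if self.tokens > self.max_tokens:
            raise BudgetExceeded(f"tokens {self.tokens} > {self.max_tokens}")


# Example limits only; pick values that fit the workload.
budget = RequestBudget(max_tool_calls=25, max_tokens=200_000)
```

The agent loop would call `budget.charge(tokens_used)` after each tool call and abort the request on `BudgetExceeded`. This does nothing to stop a prompt injection by itself, but it caps how much compute and API spend one bad request can burn.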
Just sharing the most surreal 3-hour Claude-assisted debugging session I've ever had. This was intended for my team notes but thought I'd share it. It's a treasure trove of how OpenClaw sandboxing works when things go wrong.
The article might be too hasty to report that he's Jewish, especially implying that was the motive by including it in the article title. Lots of chatter on X/Twitter about it.
Kinda crazy (scary?) how fast tragic events like this get instantly politicized on social.
It's possibly related to mistaken identity. There's apparently some guy with a similar name that has some tie to Israel. Articles and social just seem to be running with it without any fact checking.