The host is written in Rust and exposes an `extern "C"` interface, so it can be loaded as a C library by programs written in other languages. Most languages support this.
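To illustrate the mechanism, here's how a C-ABI shared library gets loaded from Python with `ctypes`. I'm using the C math library as a stand-in here, since it's present on most systems; a host library exposing `extern "C"` functions would be loaded the same way, just with its own path and function names:

```python
import ctypes
import ctypes.util

# Load a shared library through its C ABI. libm stands in for any
# C-ABI library here (e.g. an extern "C" Rust host library).
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")

# Declare the signature before calling: ctypes assumes int
# arguments and return values otherwise.
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(9.0))  # 3.0
```

Node, Bun, Ruby, etc. all have an equivalent FFI layer, which is why targeting the C ABI gets you broad language support for free.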
It's also designed to be run in an event loop. I've tested this with Bun's event loop, running TypeScript. I haven't tried it with other async runtimes, but it should be doable.
As for the browser, I haven't tried it, but you might be able to compile it to WASM -- the async stuff would be the hardest part of that, I suspect. Could be cool!
I generally agree. TypeScript is a great language, and JS runtimes have certainly had a lot of money and effort poured into them for a long time. I would add WASM to this category, as probably the closest thing to Mog. Write a program in some language, compile it to WASM, and load it into the host process. This is (probably) nice and safe, and relatively performant.
Since it's new, Mog will likely not yet beat existing systems at basically anything. Its potential lies in having better performance and a much smaller total system footprint and complexity than the alternatives. WASM is generally interpreted -- you can compile it, but it wasn't really designed for that as far as I know.
More generally, I think new execution environments are good opportunities for new languages that directly address the needs of that environment. The example that comes to mind is JavaScript, which turned webpages into dynamically loaded applications. AI agents have such heavy usage and specific problems that a language designed to be both written and executed by them is worth a shot in my opinion.
JIT means the code is interpreted until some condition kicks in to trigger compilation. This is obviously common and provides a number of advantages, but it has downsides too:
1) Code might run slowly at first.
2) It can be difficult to predict performance -- when will the JIT kick in? How well will it compile the code?
With Mog, you do have to pay the up-front cost of compiling the program. However, what I said about "no process startup cost" is true: there is no other OS process. The compiler runs in-process, and then the compiled machine code is loaded into that same process. Trying to do this safely is an unusual goal as far as I can tell. One consequence of this security posture is that the compiler and host become part of the trusted computing base. JITs are not the simplest things in the world, and not the easiest things to keep secure either. The Mog compiler is written entirely in safe Rust for this reason.
This up-front compilation cost is paid once, then the compiled code can be reused. If you have a pre-tool-use hook, or some extension to the agent itself, that code runs thousands of times, or more. Ahead-of-time compilation is well-suited for this task.
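The compile-once, reuse-many-times pattern can be sketched as a cache keyed by a hash of the source. This is purely illustrative (not Mog's actual machinery), with `eval` of a lambda standing in for a real compiler:

```python
import hashlib

# Illustrative sketch: cache "compiled" artifacts by source hash so
# the up-front compilation cost is paid once per distinct program.
_cache = {}

def compile_source(source: str):
    # Stand-in for a real compiler: builds a one-argument function.
    return eval(f"lambda x: {source}")

def get_compiled(source: str):
    key = hashlib.sha256(source.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = compile_source(source)  # cost paid once
    return _cache[key]

hook = get_compiled("x * 2")
print(hook(21))                           # 42
print(get_compiled("x * 2") is hook)      # True: reused, not recompiled
```

For a pre-tool-use hook that fires thousands of times, every call after the first hits the cache, which is why AOT amortizes so well in that scenario.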
If this is used to write a script that the agent runs once, then JIT compilation might turn out to be faster. But those scripts are often short, and in the benchmarking I've done, our compiler handles them quite quickly as it is -- there are benchmarking scripts in the repo, and it would be interesting to extend them to map out this landscape more.
Also, in my experience, in this scenario the vast majority of the total latency -- the time spent waiting for the agent to do what you asked -- comes from waiting for an LLM to finish responding, not from compiling or executing the script it generated. So I've prioritized the end-to-end performance of Mog code that runs many times.
Hum, IIRC, using your definition of an AOT compiler, V8 is an AOT compiler. V8 never interprets code; it immediately compiles it to machine code. It improves it later, but it's never slow.
V8 is a JIT compiler that uses the Ignition [1] interpreter and only compiles sections of code down to machine instructions, via the TurboFan [2] optimizing compiler, once they've been marked as hot.
V8 can also go back and forth from machine instructions back to bytecode if it identifies that certain optimization assumptions no longer hold.
Did somebody say 'global namespace'? I spent years working on one of those as part of Urbit... In general, I think you're right. Each conversation is an append-only log at the lowest layer, and I see no reason not to expose that fact as a global namespace, as long as permissions are handled gracefully.
Of course getting permissions to work well might be easier said than done, but I like this direction.
Hi, I'm the other author on this paper. You've asked a good question. I had originally planned on writing an agentic_reduce operator to complement the agentic_map operator, but the more I thought about it, the more I realized I couldn't come up with a use case for it that wasn't contrived. Instead, having the main agent write scripts that perform aggregations on the result of an agentic_map or llm_map call made a lot more sense.
It's quite possible that's wrong. If so, I would write llm_reduce like this: it would spawn a sub-task for every pair of elements in the list, which would call an LLM with a prompt telling it how to combine the two elements into one. The output type of the reduce operation would need to be the same as the input type, just like in normal map/reduce. This allows for a tree of operations to be performed, where the reduction is run log(n) times, resulting in a single value.
That value should probably be loaded into the LCM database by default, rather than putting it directly into the model's context, to protect the invariant that the model should be able to string together arbitrarily long sequences of maps and reduces without filling up its own context.
I don't think this would be hard to write. It would reuse the same database and parallelism machinery that llm_map and agentic_map use.
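The pairwise, log(n)-round reduction described above can be sketched like this, with a plain function standing in for the per-pair LLM sub-task:

```python
def tree_reduce(items, combine):
    """Reduce a list by combining adjacent pairs each round.

    Each round halves the list, so the reduction finishes in O(log n)
    rounds. In the hypothetical llm_reduce described above, each
    `combine` call would be a sub-task prompting an LLM to merge two
    elements into one of the same type; a plain function stands in
    for that call here.
    """
    while len(items) > 1:
        next_round = []
        for i in range(0, len(items) - 1, 2):
            next_round.append(combine(items[i], items[i + 1]))
        if len(items) % 2 == 1:        # odd element carries over
            next_round.append(items[-1])
        items = next_round
    return items[0]

# With an associative combine, the tree order doesn't change the result.
print(tree_reduce([1, 2, 3, 4, 5], lambda a, b: a + b))  # 15
```

Within each round, the per-pair combines are independent, which is what would let this reuse the same parallelism machinery as llm_map and agentic_map.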
Cool! It'll be interesting to follow your work. I've been thinking, as well, about quorum and voting systems that might benefit from some structure. The primitives you've described are great for the "do N things one time each" case, but sometimes I (and the AI) want "do one thing N times; pick the best somehow". (I mean, you can express that with map/reduce over small integers or something, but still: different flavor.) You can even bring public choice theory into it.
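The "do one thing N times, pick the best" pattern sketches out like this. Both `task` and `score` are hypothetical placeholders -- in practice the task would be a sub-agent attempt and the scorer a judge model, majority vote, or heuristic:

```python
import random

def best_of_n(task, score, n, seed=0):
    """Run `task` n times and return the highest-scoring result.

    `task` stands in for a sub-agent attempt; `score` for whatever
    selection mechanism you use (a judge model, voting, a heuristic).
    Seeding keeps the toy example reproducible.
    """
    rng = random.Random(seed)
    candidates = [task(rng) for _ in range(n)]
    return max(candidates, key=score)

# Toy usage: each "attempt" guesses a number; the scorer prefers
# guesses close to 50.
result = best_of_n(lambda rng: rng.randint(0, 100),
                   lambda x: -abs(x - 50), n=8)
```

As with the map primitives, the N attempts are independent, so they parallelize the same way; only the selection step is sequential.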
Today, Urbit has chat, blogging, a basic weather app, and a lot of programming facilities. Eventually it should be able to run a document editor, and you can do things similar to internet browsing even on today's Urbit.
Urbit is a long-term project starting from the foundations and working up the stack. It'll be stable soon, but it probably won't be ready to run something as heavy as a modern web browser for a while. That being said, it does work, and it's fun to play with as-is.
Yes, that's an important point. You'll have a service contract with the infrastructure node responsible for routing packets to you -- you'll pay them to route to you. There are two layers of infrastructure nodes to facilitate scaling and hopefully maintain a healthy competitive market for routing services.
Urbit is a program you can run on Linux or macOS, intended to provide a complete personal computing experience on its own. It runs as a virtual machine for now, although it could run as a unikernel on bare metal (good project for a contributor who's interested!).
This VM acts like an operating system, in the sense that it loads and runs other applications within itself, and in the sense that it presents an application switcher and overall system management tools to the user.
This VM is designed from scratch to be as simple as possible, based on the thesis that the reason everyone has thought of a personal server but nobody runs one is that it's too complicated to do your own sysadmin.
Why is it complicated to do your own sysadmin? Because Linux is 15 million lines of code, and then there are tons of layers on top of that. What percentage of programmers even know how the internet works? A fair number of programmers have a decent sense for some corner of the modern computing world, but even seasoned professionals don't usually know the full structure of the digital world. How does BGP interact with the IP protocol? How do you make sure fsync() actually did what you wanted it to do? How does Linux overcommit_memory work? etc.
Urbit is weird, but that's mostly because it's a parallel universe of computing, not because it's inherently crazier than the alternative. We all have Stockholm Syndrome about 'ls -alH', and don't tell me 'grep' is an intuitive name.
In fact, there are very few basic building blocks in Urbit: binary trees of integers, the idea of a persistent event-log-based computer, and cryptographic identities. Pretty much everything is constructed out of those components.
And it's designed for a modern world with billions of users who might not all be completely trustworthy, so whole categories of complexity go away -- such as NATs.
So there's no standard industry term for describing this system, because there are no direct analogs or competitors.
So what exactly makes it an OS? The JVM is a virtual machine, but we don't consider it an OS. So why not just call Urbit a virtual machine?
> What percentage of programmers even know how the internet works? A fair number of programmers have a decent sense for some corner of the modern computing world, but even seasoned professionals don't usually know the full structure of the digital world. How does BGP interact with the IP protocol? How do you make sure fsync() actually did what you wanted it to do? How does Linux overcommit_memory work? etc.
Seeing how Urbit runs on OSes such as Linux and uses TCP/IP for networking over the Internet, which itself relies on all those fun WAN protocols like BGP to make it work, it's not really solving those problems, just hiding them beneath another layer of complexity and terminology.
from the FAQ:
"In early 2019, Curtis left the Urbit project and gave all of his voting interest (both as address space and voting shares in the company) back to Tlon. He retains a non-voting, minority interest in both the address space and the company — but is not involved in the day-to-day development or operations."
Here is the ending of the second-to-last entry in Unqualified Reservations by Curtis Yarvin (AKA Mencius Moldbug)[0]:
> Anyway. UR will reemerge, of course. But not here, and not soon—and probably not even in this form. I’ll also try to do something non-lame with the archives. Thanks for reading!
If the name alone isn't enough, I feel like this context makes it pretty obvious that he named Urbit after his racist blog.
He claims to have been working on Urbit since 2002, and the first UR post was from 2007. It's entirely believable to me that the name Urbit came first (albeit not publicly), but the names are very obviously related and the quote I pulled above shows that he considers some post-2014 project a continuation of UR. I can believe the specific claims of the comment you linked, but I think it dances around the fact that these two projects are purposefully tied together by nomenclature. Whether UR or Urbit was named first in Yarvin's head isn't really the point.
Yes. Not exactly a standard term, but there is no standard term for what Urbit is. It's a VM and a personal server, yes -- but those terms don't clearly convey the scope of the project.
How is it a personal server if I need a personal server to run it? Am I buying compute power on other people's computers? Also, why call it an OS if it doesn't replace existing ones?
It’s a personal server OS. MS-DOS was emulated on mainframes before it was run directly on hardware. Urbit wants to do the same, starting by being emulated on traditional computer OSes, and eventually moving to bare metal.
What's a personal server OS? Will I use it to write text documents and browse the web? Can I run the X Window System on it? It appears to me it's currently accessed through a browser. Is that temporary?