A method call like `.trunc()` is still going to be far less ergonomic than `as`. It relies on inference or a turbofish to pick the type, and it has all the syntactic noise of a function call on top of that.
Not to mention that this sort of proliferation of micro-calls, for what should be at most one instruction, has a cost in debug-build performance and/or compile times (though that is something that should be fixed regardless).
> A method call like `.trunc()` is still going to be far less ergonomic than `as`. It relies on inference or a turbofish to pick the type, and it has all the syntactic noise of a function call on top of that.
If `as` gets repurposed for safe conversions (e.g. `u32` to `u64`), there's some merit to the more hazardous conversions being slightly noisier. I'm all for them being no noisier than necessary, but even in my most conversion-heavy code (which has to convert regularly between `usize` and `u64`), I'd be fine writing `.into()` or `.trunc()` everywhere, as long as I don't have to write `.try_into()?` or similar.
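For concreteness, here's a rough sketch of the shapes being compared; `trunc()` is the hypothetical method under discussion, not an existing standard-library API:

```rust
use std::convert::TryInto;

fn demo(len: usize) -> u64 {
    let widened = len as u64; // lossless widening on 64-bit targets
    let narrowed: u32 = len.try_into().unwrap(); // today's checked, noisy form
    // With a hypothetical `trunc`, inference or a turbofish picks the type:
    // let narrowed: u32 = len.trunc();
    // let narrowed = len.trunc::<u32>();
    let _ = narrowed;
    widened
}
```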
> Not to mention that this sort of proliferation of micro-calls, for what should be at most one instruction, has a cost in debug-build performance and/or compile times (though that is something that should be fixed regardless).
I fully expect that such methods will be inlined, likely even in debug mode (e.g. `#[inline(always)]`), and compile down to the same minimal instructions.
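Something like this minimal sketch is what I'd expect (the `Trunc` trait here is hypothetical, purely to illustrate the inlining point):

```rust
// Hypothetical trait, not part of std; the point is that `#[inline(always)]`
// makes the call disappear, even in unoptimized debug builds.
pub trait Trunc<T> {
    fn trunc(self) -> T;
}

impl Trunc<u32> for u64 {
    #[inline(always)]
    fn trunc(self) -> u32 {
        self as u32 // a single register-width truncation at the machine level
    }
}
```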
Yes, this is specifically what I'm disagreeing with.
> I fully expect that such methods will be inlined, likely even in debug mode (e.g. `#[inline(always)]`), and compile down to the same minimal instructions.
Many things in the language theoretically go through a trait as well, except that we have special cases in the compiler to handle those traits more efficiently. If this were a performance issue, there's no reason we couldn't do the same for `.trunc()` or `.into()`.
The compiler doesn't have to implement a call as a call; having "magic functions", calls to which are special-cased by the code generator, is an old and time-honored tradition.
I've been using https://messages.google.com to get something like the desktop iMessage experience with Android. Does that work for your use case? (I don't use iMessage, so I could just be missing some killer feature it has, or something.)
The tokenizer is not really a good demonstration of the differences between these styles. A more representative comparison would be the later stages that build, traverse, and manipulate tree and graph data structures.
I think a reasonable comparison would have to be a DoD Rust parser vs the current Rust parser. Comparing across languages isn't very useful, because Zig has very different syntax rules and doesn't provide diagnostics anywhere near the level Rust does. The Rust compiler (and thus its parser) spends an incredible amount of effort on diagnostics, to the point of actually trying to parse syntax from other languages (e.g. Python), just to warn people not to use Python syntax in Rust. Not to mention that it needs to deal with decl and proc macros, intertwine that with name resolution, and so on. All of this of course hurts parsing performance quite a lot, and IMO it would both make it much harder to write the whole thing in DoD and shrink the DoD performance benefits, because of all the heterogeneous work the Rust frontend does. Those are of course deliberate decisions by Rust that favor things other than compilation performance.
Your points here don't really make sense. There are many ways you can apply DoD to a codebase, but by far the main one (both easiest and most important) is to optimize the in-memory layout of long-lived objects. I won't claim to be familiar with the Rust compiler pipeline, but for most compilers, that means you'd have a nice compact representation for a `Token` and `AstNode` (or whatever you call those concepts), but the code between them -- i.e. the parser -- isn't really affected. In other words, all the fancy features you describe -- macros intertwined with name resolution, parsing syntax from other languages, high-quality diagnostics -- don't care about DoD!

Our approach in the Zig compiler has evolved over time, but we're slowly converging towards a style where all of the access to the memory-efficient dense representation is abstracted behind functions. So, you write your actual processing (e.g. your parser with all the features you mention) just the same; the only real difference is that when your parser wants to, for instance, get a token (as input) or emit an AST node (as output), it calls functions to do that, and those functions pull out the bytes you need into a lovely `struct` or (in Rust terms) `enum` or whatever the case may be.
Our typical style in Zig, or at least what we tend to do when writing DoD structures nowadays, is to have the function[s] for "reading" that long-lived data (e.g. getting a single token out from a memory-efficient packed representation of "all the tokens") in the implementation of the DoD type, and the functions for "writing" it in the one place that generates that thing. For instance, the parser has functions to deal with writing a "completed" AST node to the efficient representation it's building, and the AST type itself has functions (used by the next phase of the compiler pipeline, in our case a phase called AstGen) to extract data about a single AST node from that efficient representation. That way, barely any code has to actually be aware of the optimized representation being used behind the scenes. As mentioned above, what you end up with is that the actual processing phases look more-or-less identical to how they would without DoD.
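To make that concrete, here's a rough sketch in Rust terms (my own illustration of the pattern, not the Zig compiler's actual layout): packed struct-of-arrays storage, a writer used only by the generator, and a reader that hands back a convenient unpacked value:

```rust
#[derive(Clone, Copy, Debug)]
enum TokenTag { Ident, Number, Plus }

// Dense struct-of-arrays storage: one small tag and one offset per token,
// instead of a pointer-heavy vector of fat structs.
struct Tokens {
    tags: Vec<TokenTag>, // one byte each
    starts: Vec<u32>,    // byte offset into the source
}

// The convenient "unpacked" view the processing code actually works with.
#[derive(Debug)]
struct Token {
    tag: TokenTag,
    start: u32,
}

impl Tokens {
    // "Reading": used by the next pipeline phase; rebuilds one token on demand.
    fn get(&self, i: usize) -> Token {
        Token { tag: self.tags[i], start: self.starts[i] }
    }

    // "Writing": used only by the tokenizer that builds this structure.
    fn push(&mut self, tag: TokenTag, start: u32) {
        self.tags.push(tag);
        self.starts.push(start);
    }
}
```

Everything outside `get` and `push` stays oblivious to the packed layout, which is the point.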
FWIW, I don't think the parser is our best code here: it's one of the oldest "DoD-ified" things in the Zig codebase, so it has some outdated patterns and questionable naming. Personally, I'm partial to `ZonGen`[0] as a fairly good example of a "processing" phase (although I'm admittedly biased!). It takes an AST as input and outputs a simple tree IR for a subset of Zig which is analogous to JSON. Then, for an example of code consuming that generated IR, take a look at `print_zoir`[1], which just dumps the tree to stdout (or wherever) for debugging purposes. The interesting logic is in `PrintZon.renderNode` in that file: note how it calls `node.get`, and then just has a nice convenient tagged union (`enum` in Rust terms) value to work with.
I also don't know all the details, but the Rust parser tokens contain horrible crimes, primarily because of macros. All I wanted to say was that applying DoD to the parser in Rust would (IMO) be much more difficult than in Zig, because of language differences and different approaches to error reporting. Not saying it's impossible, of course. That being said, I don't really think so much effort would be worth it here; the gain would be minimal in the grand scheme of things. We have bigger perf problems than parsing.
"Hand-rolled assembly" was one item in a list that also included DoD. You're reading way more into that sentence than they wrote- the claim is that DoD itself also impacts the maintainability of the codebase.
Isn't stack overflow made safe via guard pages and probes (on sufficiently high-tier target platforms)? That is, you should get a guaranteed error, even if that error is a segfault, rather than memory corruption.
What a thought-terminating way to approach an idea. Effects are not simply renamed conditions, and we have a whole article here describing them in more detail than that one sentence, so you can see some of the differences for yourself.
The handler doesn't have to follow the pattern of "do its work, resume the computation, go away."
It can instead do things like "do some work, resume the computation, do some more work."
Or even more invasively, "stash the computation somewhere, return from the handler site, let the rest of the program run for a while, then resume the computation."
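Here's a loose Rust sketch of those three shapes, modeling the suspended computation as a boxed closure; there's no real effect system here, and all the names are made up for illustration:

```rust
// The suspended computation: give it the handler's answer and it runs on.
type Cont = Box<dyn FnOnce(i32) -> i32>;

// "Do some work, resume the computation, do some more work."
fn wrapping_handler(k: Cont) -> i32 {
    println!("handler: before resume");
    let result = k(42); // resume with an answer
    println!("handler: after resume");
    result
}

// "Stash the computation somewhere, return from the handler site."
fn deferring_handler(k: Cont, pending: &mut Vec<Cont>) {
    pending.push(k); // return without resuming
}

fn main() {
    println!("{}", wrapping_handler(Box::new(|x| x + 1)));

    let mut pending: Vec<Cont> = Vec::new();
    deferring_handler(Box::new(|x| x * 2), &mut pending);
    // ...let the rest of the program run for a while...
    for k in pending.drain(..) {
        println!("{}", k(10)); // then resume the stashed computation
    }
}
```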
The stuff Fil-C adds is on the same footing as `unsafe` code in Rust: its implementation isn't checked, but its surface area is designed so that (if the implementation is correct) the rest of the program can't break it.
Whether the amount and quality of this kind of code are comparable between the two approaches depends on the specific programs you're writing. Static checking, which can also be applied in more fine-grained ways to parts of the runtime (or its moral equivalent), is an interesting approach, depending on your goals.
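As a toy illustration of that footing (my own example, nothing from Fil-C): the implementation uses an unchecked access internally, but the public surface makes it impossible for a caller to reach that path with a bad index:

```rust
pub struct Slots {
    data: Vec<u32>,
}

impl Slots {
    // Safe surface: callers can pass any index without risking UB.
    pub fn get(&self, i: usize) -> Option<u32> {
        if i < self.data.len() {
            // SAFETY: `i` was just bounds-checked, so the unchecked read
            // below cannot go out of bounds.
            Some(unsafe { *self.data.get_unchecked(i) })
        } else {
            None
        }
    }
}
```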
> The stuff Fil-C adds is on the same footing as `unsafe` code in Rust: its implementation isn't checked, but its surface area is designed so that (if the implementation is correct) the rest of the program can't break it.
It’s not the same.
The Fil-C runtime is the same runtime in every client of Fil-C. It’s a single common trusted computing base, and there’s no reason for it to grow.
On the other hand, Rust programmers use `unsafe` all over the place, not just in some core libraries.
Yeah, that's what I meant by "depends on the specific programs you're writing." Confining unsafe Rust to core libraries is totally something people do.
There’s no reason to believe that one program is inherently representative. sudo-rs eschews dependencies, and so its proportion of unsafe code is likely higher than most programs’.
Furthermore, 170 uses in a 200-line program vs. a one-million-line program are very different things. I don’t know offhand how big sudo-rs is.
Even in embedded OS kernels, it’s often around 1%-5% of code. Many programs have no direct unsafe code at all.
I mean, again, yeah. I specifically compared the safe API/unsafe implementation aspect, not who writes the unsafe implementation.
To me the interesting thing about Rust's approach is precisely this ability to compose unrelated pieces of trusted code. The type system and dynamic semantics are set up so that things don't just devolve into a yolo-C-style free-for-all when you combine two internally-unsafe APIs: if they are safe independently, they are automatically safe together as well.
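For a trivial example of what I mean: `Vec` and `String` are both implemented with `unsafe` internally, yet plain safe code can mix them freely without taking on any new proof obligations at the seam:

```rust
// Two internally-unsafe abstractions composed in entirely safe code.
fn tag_all(items: Vec<String>) -> Vec<String> {
    items.into_iter().map(|s| format!("[{s}]")).collect()
}
```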
The set of internally-unsafe APIs you choose to compose is a separate question on top of that. Maybe Rust, or its ecosystem, or its users, are too lax about this, but I'm not really trying to have that argument. Like I mentioned in my initial comment, I find this interesting even if you just apply it within a single trusted runtime.