On the note of quay.io, is there a function on the website to see the Containerfile that created the artifact? I recall digging around in the past but couldn't really find a link to the source Containerfile.
It's my understanding that the Neverware acquisition eventually led to ChromeOS Flex, which is ChromeOS designed to run on any generic x86_64 platform.
No, there isn’t. Elixir, and by extension Erlang and the BEAM VM, doesn’t have the concept of stackless coroutines.
Task.async/1 (or Task.async/3), at a high level, just spawns another BEAM process, links it to the currently running process, sets up a process monitor, and then runs the function. Task.await/2 just pulls the returned message from the current process's mailbox.
There is a semantic difference between just running a line of code and running it with Task.async, right? So then there is a distinct concept of sync vs async.
Everything in the BEAM is "low-level" async with implicit yield points at function boundaries. Task.async just gives you a convenient future so you can get the value back, instead of the BEAM throwing it away (goroutines also throw away the result). You could also write that boilerplate by hand. There are also other nice things that Task.async gives you, but that's not relevant to this discussion.
IOW, async/await in the BEAM is a high-level thing (like goroutines), which is one of two reasonable choices, the other reasonable choice being "what Zig did".
The BEAM VM doesn't have faculties to accommodate any form of concurrency other than synchronous, preemptively scheduled green threads. Put simply, there is nothing other than synchronous in Elixir (edit: or Erlang or any other BEAM VM language). The Task module is basically just convenience functions (edit: convenience functions around existing BEAM VM primitives for spawning BEAM/Erlang processes) meant to facilitate short-term execution of, well, tasks.
Task.async/3 and Task.await/2 are, from the thousand-foot view, the equivalent of:
parent = self()
spawn(fn -> send(parent, do_something()) end)
receive do
  result -> IO.puts(result)
end
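For comparison, here's the Task version of the same round trip (a minimal sketch; the doubling fun is just a stand-in for whatever work `do_something/0` would do):

```elixir
# Task.async/1 spawns a linked, monitored process running the fun.
task = Task.async(fn -> 21 * 2 end)

# Task.await/2 sits in a receive until the result message arrives
# (default timeout: 5000 ms), then returns the fun's result.
result = Task.await(task)
IO.puts(result)  # prints 42
```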
In Elixir/Erlang parlance, I have just spawned a process, where the BEAM VM handles both the internal data structures that make communication between processes concurrency-safe and the scheduling in userspace.
Put more generally, the mentality of concurrency in Elixir/Erlang is equivalent to that of traditional pthreads in C.
I think you are misunderstanding me, so let me ask a simple question:
Is it (except for performance) always 100% equivalent whether a calculation is done "locally" or whether it is run on some other, potentially just-spawned, thread and then retrieved from that thread?
If your answer is "no it is not always 100% equivalent" then we agree (and I think we do).
No, they're equivalent. It makes no difference whether I calculate in the current process, spawn a process, start a process via a Supervisor, use Task.async, or what not. They are all equivalent in the sense that they use BEAM processes.
An Elixir, Erlang, or any other developer using the BEAM VM (or at the very least myself, since I don't want to generalize) doesn't make a semantic distinction between spawn and Task.async/3 because there is none. The significant differentiation is whether or not the caller is expecting a returned value; if not, then Task.start_link can be used instead.
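A quick sketch of that distinction (the fun bodies here are arbitrary placeholders):

```elixir
# Caller wants the value back: Task.async + Task.await.
task = Task.async(fn -> Enum.sum(1..100) end)
5050 = Task.await(task)

# Caller doesn't care about a return value: fire and forget.
# (Task.start_link/1 is the supervised variant of the same idea.)
{:ok, _pid} = Task.start(fn -> IO.puts("side effect, result discarded") end)
Process.sleep(50)  # give the background task a moment before the script exits
```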
I don't need to worry about function colouring because there is none. I also don't need to worry about async vs Task.Run like in C#, because in Elixir there is no such split. To be asynchronous in Elixir/Erlang/etc. is to be synchronous. To equate it to modern backends, we basically have microservices, but every service can directly call every other.
I think I also might have been talking past you a bit: the BEAM VM sets itself apart from C#'s async/await or the TPL due to its implementation of the actor model. This video by Sasa Juric can explain the general overview of the BEAM better than I can: [1].
But a rough tl;dw: a BEAM process is to the BEAM what an OS process is to the OS. We don't define a main function; instead, supervisors in the BEAM spawn the processes we define, and those are our "entrypoints."
They are not equivalent. Running a function in another process incurs the overhead of copying the data to the other process (potentially over the network), in both directions. Task.async also introduces extra machinery: a link between the two processes, a monitor, and a receiver for the result with a timeout.
Moreover, there is a context switch, and the code is more likely to run on a different core if you run it async than if you run it in the same process (which is very likely, though not guaranteed, to stay on the same core).
The way in which they are equivalent is that the code you write is identical, the bytecode that gets run is identical, and the existence of implicit yields is identical between the async and "sync" code.
> They are not equivalent. Running a function in another process incurs the overhead of copying the data to the other process (potentially over the network). In both directions.
> Moreover, there is a context switch and it's more likely the code will run on a different core if you run it async than if you run it in the same process (which is very likely but not guaranteed) to run on the same core.
Yes, I agree, but the question as presented by valenterry asks us whether or not there is some semantic difference, not one confined to performance (gains or losses). Regardless, what you have stated is all true.
But as an aside, and not directed towards you: Task.async/3 doesn't do anything that the developer cannot already do. Even in the introductory Elixir tutorial, the fledgling developer is exposed to the different mechanisms that power Task.async/3, and the source code [1] reflects this, although supervisors are covered much later down the line.
> The way in which they are equivalent, is that the code that you write is identical, and the bytecode that gets run, and the existence of implicit yields is identical between the async and "sync" code.
And just to add on for those not familiar with the BIFs: receive, which is what Task.await/2 and Task.yield/2 use under the hood, yields execution. NIFs are another one.
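A small sketch of that yielding behaviour via Task.yield/2's timeout (the sleep duration and timeouts are arbitrary values chosen for illustration):

```elixir
task = Task.async(fn ->
  Process.sleep(200)
  :done
end)

# The receive inside Task.yield/2 gives up after 50 ms: no reply yet.
nil = Task.yield(task, 50)

# Waiting longer, the result message arrives and is returned.
{:ok, :done} = Task.yield(task, 1_000)
```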
> They are all equivalent in the sense that they use BEAM processes.
Sure, but that was not my question. 100% equivalent means that there cannot be any observable change in the behaviour of the whole program/application (except for performance).
Or in other words: is the sole purpose of the existence of `Thread.async` (or spawn, pick whichever you want) to change performance characteristics? If not, what is the purpose of its existence then?
> Sure, but that was not my question. 100% equivalent means that there cannot be any observable change in the behaviour of the whole program/application (except for performance).
In a vacuum (i.e. via spawn/1 or spawn/3) there is no observable change in behaviour throughout the whole system.
If they're linked via spawn_link then a child process crashing means the parent process dies with it unless handled in some way.
So ultimately, no, there aren't observable changes. My logging process doesn't care or even know that my WebSocket process crashed.
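A minimal sketch of that spawn vs spawn_link difference; the parent traps exits so it can observe the linked child's crash instead of dying with it (`:boom` is an arbitrary exit reason):

```elixir
# A plain spawn: the child dies alone, the parent never hears about it.
spawn(fn -> exit(:boom) end)

# A linked child: the exit propagates. Trapping exits converts the
# propagated exit into an ordinary {:EXIT, pid, reason} message
# instead of killing the parent.
Process.flag(:trap_exit, true)
pid = spawn_link(fn -> exit(:boom) end)

receive do
  {:EXIT, ^pid, :boom} -> IO.puts("linked child crashed; parent survived")
end
```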
> Or in other words: is the sole purpose of the existence of `Thread.async` (or spawn, pick whichever you want) to change performance characteristics? If not, what is the purpose of its existence then?
As stated before, Task.async/3 is just a convenience wrapper around low-level BEAM primitives; there isn't anything special about Task.async/3 that you couldn't do via spawn.
The reason being that the BEAM VM schedulers prioritize the overall latency of the system over throughput by counting reductions (parlance for a function call, used in particular to track looping in the BEAM): after around 4,000 reductions, the scheduler moves the current process to the back of the run queue; repeat ad infinitum.
So you don't get massive speedups just by spawning a new BEAM process, but your independent units of execution still enjoy the same overall latency as before. If you want performance increases in the BEAM you would need to jump to dirty schedulers and NIFs, which carry their own dangers.
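You can watch the reduction counter the scheduler uses for this accounting (a small sketch; the loop is only there to burn reductions, and the exact count will vary by run and OTP version):

```elixir
# Each process carries a reduction counter; the scheduler preempts a
# process once it has burned through its reduction budget.
{:reductions, before} = Process.info(self(), :reductions)

Enum.each(1..10_000, fn _ -> :ok end)  # burn some reductions

{:reductions, later} = Process.info(self(), :reductions)
IO.puts("reductions consumed: #{later - before}")
```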
> In a vacuum (i.e. via spawn/1 or spawn/3) there is no observable change in behaviour throughout the whole system.
Okay, let me follow up on this one. So could we take arbitrary Erlang/Elixir programs and automatically/mechanically rewrite them to push previously inlined calculations onto a different process (using spawn), or the other way around, without causing any observable difference in behaviour in any situation except for a difference in performance?
> As stated before, Task.async/3 is just a convenience wrapper around low-level BEAM primitives; there isn't anything special about Task.async/3 that you couldn't do via spawn.
You can likewise answer my question for the BEAM primitives (like processes/threads): is the sole purpose of the existence of those primitives to change performance characteristics? If not, what is the purpose of their existence then?
Also, just so that we understand each other: I'm a big fan of the BEAM. This is not a criticism or anything; I just want to explain why I believe that function coloring (or however you call it) can never conceptually be removed from a language if sync/async is to be supported.
> Also, just so that we understand each other: I'm a big fan of the BEAM.
We probably should've established this way earlier; sorry if I came across as pretentious or off-putting.
I'll answer in a bit of reverse order.
> I just want to explain why I believe that function coloring (or however you call it) can never conceptually be removed from a language if sync/async is to be supported.
To respond in a more roundabout way: why doesn't C have colouring issues if it too supports asynchronicity (I'm pretty sure I've made up a word, but hope the point gets across fine) via pthreads?
The colouring issue, at least under my understanding of C#'s async/await system, exists to accommodate the fact that Roslyn transforms the C# source into .NET IL with a state machine. To me, it's just a consequence of what layer of the stack we're talking at. Erlang doesn't need to worry about this since the runtime was designed around this specific model of concurrency. But if we look at C#, .NET only supports OS threads (though David Fowler is investigating green threads in, iirc, the labs repository on the DotNet GitHub), so async/await is, for lack of a better word, bolted on. It's my understanding that IronPython (or maybe it was PythonNet) has difficulties calling into C# async for that reason; though if I'm wrong please correct me.
> is the sole purpose of the existence of those primitives to change performance characteristics? If not, what is the purpose of its existence then?
Fault tolerance, latency, scalability, perhaps minor throughput increases (yeah, this is a bit of a contradiction of what I said earlier, but I've revised my opinion: you wouldn't see massive performance gains unless resorting to NIFs).
The Phoenix Framework's 2 million WebSocket connection challenge is, I think, the best demonstration of the use case for BEAM processes. A specific WebSocket connection dies? No problem, the Supervisor respawns it. Two million connections receiving a Wikipedia-length article on a topic without the entire system coming to a crawl? No problem. There isn't a need to worry about kqueue, poll/epoll, or IOCP here; just fire off a task for the purpose and let it do its thing.
But overall, you wouldn't spawn a BEAM process just so you could compute the matrices behind RAID6/RAIDZ2; you'd delegate those to NIFs. The biggest gains in overall performance came from BeamAsm instead.
> Okay, let me follow up on this one. So could we take arbitrary Erlang/Elixir programs and automatically/mechanically rewrite them to push previously inlined calculations to be run on a different process (using spawn) or the otherway around - and that without causing any observable difference in behaviour in any situation except for difference in performance?
Yes. Using the case of the 2 million WebSocket connection challenge from earlier: it doesn't really matter if each of those connections spawned another process as part of its routines; the other processes don't know and don't care. Taking the more generic case of Elixir's GenServers (or gen_server in Erlang): when I perform a call (as in the GenServer behaviour), it doesn't matter to the calling process what happens behind the scenes; it just blocks and waits for a response. The GenServer could fire off any number of processes it wants, but the calling process doesn't care about that.
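A minimal sketch of that blocking call, using a hypothetical Counter GenServer (the module and its API are made up for illustration, not from the thread):

```elixir
defmodule Counter do
  use GenServer

  # Client API: the caller blocks in GenServer.call/2 until the reply.
  def start_link(initial), do: GenServer.start_link(__MODULE__, initial)
  def increment(pid), do: GenServer.call(pid, :increment)

  # Server callbacks
  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_call(:increment, _from, count) do
    # The server could spawn any number of processes here;
    # the caller just waits for this reply and knows nothing else.
    {:reply, count + 1, count + 1}
  end
end

{:ok, pid} = Counter.start_link(0)
IO.puts(Counter.increment(pid))  # prints 1
IO.puts(Counter.increment(pid))  # prints 2
```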
> why doesn't C have colouring issues if it too supports asynchronicity
But it does have those issues. In fact it is worse than in most other languages: just because there is no distinct support for async in a language does not mean it doesn't have async capabilities. C does have those; you just don't have any language support whatsoever, so you have to take care of it yourself. Delegating the execution to the OS doesn't change that.
For example, if you want to transform the elements of an array concurrently, you now have to use pthreads. You cannot do that while using the same code as before (which is what removing function coloring is all about).
Maybe we have a misunderstanding here: I'm talking about sync/async, not the async/await syntax that many languages have. The latter helps deal with colored functions; the former is what enables the existence of concurrency. Any language that is, in one way or another, capable of running code concurrently suffers from that problem. It can be avoided by enforcing synchronous execution only, which some languages do (especially some DSLs), but most general-purpose languages support concurrency.
And once they do, there are two choices: either the programmer has to deal with it (using Thread.async, spawn, pthreads, go channels, promises, ...) or they don't. If they don't, then sync and async code must look exactly the same (or be convertible back and forth automatically). And this goal (either making it look the same or converting it automatically) is just not solvable in general. It has been tried for decades.
For instance, check the paper "A Critique of the Remote Procedure Call Paradigm" from 1988; I think it was by Tanenbaum. It lists some of the problems that just cannot be solved in general. Some programs will always behave differently after the automatic conversion; it's impossible to prevent that.
> > Okay, let me follow up on this one. So could we take arbitrary Erlang/Elixir programs and automatically/mechanically rewrite them to push previously inlined calculations to be run on a different process (using spawn) or the otherway around - and that without causing any observable difference in behaviour in any situation except for difference in performance?
> Yes.
What I'm trying to explain is that you are mistaken. It just is not possible. Maybe in specific cases, sure, but not in a way that is safe for every use case that can be thought of.
And, luckily, neither Erlang nor Elixir (nor the latest BEAM language, I forgot the name) tries to do that. They give this control to the developer and let them deal with the decision of how and when to use concurrency. They just try to make that as easy as possible, but they don't try to make sync and async code look and be the same.
This is a massive segue, but where would memory-unsafe languages be acceptable for use at Cloudflare? I'm assuming existing software that needs to be maintained would be one of them, but what about low-level firmware, kernel code, and so on? Would Rust be a replacement for C in those areas too?
We did do some encryption with LUKS, and I’d try to write over boot records, keys, and headers, but I was pessimistic that that was enough. Not an encryption expert myself. I've always felt that any given encryption tech (be it hardware or software) has the possibility of a vulnerability or backdoor being found later.
So it made sense to me that a physical erasure prior to recommissioning would be good. There are also regulatory/compliance checkboxes (be they effective or not).
Write a C program that does something interesting or useful then compile it with -S and inspect the assembly.
I am a beginner at assembly and learned everything from GCC.
There are so many details. I learned about the rbp and rsp registers recently. And from a Reddit post about the compiler, someone told me I need to keep the stack aligned.
For example, I saw that in your blog [1] you used the base image on line 1: `FROM ghcr.io/cgwalters/fedora-silverblue:37`.
I was wondering where I could find the Containerfile for that; my Google-fu and searching through the Fedora pages are failing me.
[1] https://www.ypsidanger.com/building-your-own-fedora-silverbl...