An aside, but I recently learned -- if one is willing to use a very modest amount of memory -- summing floating-point numbers with no loss of precision is effectively a solved problem with the XSUM algorithm.
That paper explains some useful optimisation details, but obviously since the floats are all (either infinity or) some multiple of a known tiny fraction (their smallest non-zero number), we can definitely sum them accurately.
I think you either haven't thought about this or you did your math wrong.
You need (2^e) + m + 1 bits, where e is the number of exponent bits and m the number of fraction bits. For the bigger formats that is more bits than would fit in the cheap machine integer type you just have lying around, but it's not that many in real terms.
Let's try a tiny one first, though: the "half-precision" f16 type, with 5 bits of exponent, 10 bits of fraction, and 1 sign bit. We need 2^5 + 10 + 1 = 43 bits, which will actually fit in the 64-bit signed integer type on a modern CPU.
Now let's try f64, the big daddy: 11 exponent bits, 52 fraction bits, and 1 sign bit, so 2048 + 52 + 1 = 2101 bits in total. As I said, it doesn't fit in our machine integer types, but it's much smaller than a kilobyte of RAM.
Edited: I can't count, though it doesn't make a huge difference.
You also need some extra bits at the top so that it doesn't overflow (e.g., on an adversarial input filled with copies of the max finite value, followed by just as many copies of its negation, so that it sums to 0). The exact number will depend on the maximum input length, but for arrays stored in addressable memory it will add up to no more than 64 or so.
Thanks, you're right about the accumulating excess. I don't think you can actually get to 64 extra bits, but sure, let's say 64 extra bits if you want a general-purpose in-memory algorithm. It's not cheap, but we shouldn't be surprised it can be done.
> Notice I have changed the extension from .js to .mjs. Don’t worry, either extension can be used. And you are going to run into issues with either choice
As someone that has used module systems from Dojo to CommonJS to AMD to ESM with webpack and esbuild and rollup and a few others thrown in ... this statement hits hard.
Yeah, the CommonJS-to-ESM transition has been the Python 2 to Python 3 transition of JavaScript, except the benefits are limited (at least compared to the hassle created).
There are many libraries that have switched to ESM only (meaning they don't support CommonJS). But even today, the best way to find the last CommonJS version of one of those libraries is to go to the "versions" tab on npm and find the most-downloaded version in the last month; chances are, that will be the last CommonJS version.
Yes, in a vacuum, ESM is objectively better than CommonJS, but how TC39 almost intentionally made it incompatible with CommonJS (via top-level await) is just bizarre to me.
It had to be incompatible with CommonJS regardless of top level await. There is no imaginable scenario where browsers would ship a module system with synchronous request and resolution semantics. A module graph can be arbitrarily deep, meaning that synchronous modules would block page load for arbitrarily deep network waterfalls. That’s a complete non-starter.
Given that, top-level await is a sensible affordance, which you’d have to go out of your way to block because async modules already have the same semantics.
Recently, Node has compromised by allowing ESM to be loaded synchronously absent TLA, but that's only feasible because Node is loading those modules from the file system, rather than any network-accessible location (and because it already has those semantics for CJS). That compromise makes sense locally, too. But it still doesn't make sense in a browser.
Bundler engineer here. ESM is great when it comes to build-time optimisations. When a bundler runs into CJS code it literally deopts the output to accommodate - so from that side it's amazing.
But also, there's a special place in hell for the people that decided to add default exports, "export * from" and top level await.
CommonJS is also very weird in that a "module" instance can be reassigned to a reference to another module:
module.exports = require('./foo')
and there's no way to do this in ESM (for good reason, but also no one warned us). In fact major projects like React use CJS exports and the entire project cannot be optimized by bundlers. So, rather than porting to ESM, they created a compiler LOL
If I may, the evil is not in the top-level await itself but in the use of a top-level await in a module that exports. That is evil. But a top-level await in a program's main script seems OK to me.
I haven't thought about that in years. I didn't realize it had been solved.
Browser support looks pretty good.
I guess now I have to figure out how to get this to play nice with Vite and TypeScript module resolution.... and now it's starting to hurt my brain again, great.
I recently found out that the Function object will compile any JavaScript you care to feed it. At runtime! new Function('class MyClass { ... }; return MyClass'). My system does not allow for "imports" -- no npm, etc. -- so this is a bit of a lifesaver for me. I realize in JS land it may not be as useful, but it is pretty handy.
For those of us forced to be in the JS ecosystem, finally having a runtime that Just Works has been great.
Bun has replaced a massive number of tools and dependencies from our stack and really counteracted the tooling explosion that we were forced into with node.
In our case, it's not so much being forced to use Bun, but rather that Bun is in real terms infinitely more convenient than lower-level languages. Firstly, even the most novice of novices tend to have a passing familiarity with JS/TS, whereas this is not true for C/Zig/Rust/etc, so it's easier for people to contribute to our projects. Bun also provides so many things for free, statically, and cross platform. You want a TCP server? A websocket server? SQLite database? You want to include static assets? You want to generate static assets at compile time? Etc? Bun provides it.
Attempting to replicate even a modicum of this in lower-level languages can be a real struggle. Rust is definitively the least-worst in this respect, because there's been a concerted effort by the community to provide stable packages that do most things. But Rust is a complicated and unapproachable language. Use other low-level languages like C/Zig and you immediately run into issues with libraries and static linking. And even if you find a library, its documentation is either lacklustre or outright missing (looking at you, libuv and libxev respectively).
Consider the amount of manual setup and third-party build-system finagling just to: 1) run a TCP server; 2) fetch data over HTTP; 3) do both of these using a single event loop (no separate threads); 4) use SQLite for storage; and 5) have all this produce a single self-contained executable. I cannot overstate how trivial all of this is with Bun.
I can see the point of trying to make a distinction, but it is muddy.
An emulator can operate on many different levels of trying to match the behavior of the underlying hardware. CPU emulators can emulate at the opcode level (easier) or try to increase accuracy by emulating the CPU pipeline cycle by cycle (harder).
In this particular case, the distinction between an "emulator" and a "hardware emulator" seems apt, because the article describes how the required fixes meant tracking the state of individual pins of the hardware chips. To me, this represents the emulation needing to "go down another level" and model the physical hardware to a certain degree to gain the needed accuracy.
The states decided to add one digit to these numbers to further subdivide
them. They did it differently, of course, and some didn't subdivide at all.
Some of them have typos, with "O" in place of "0" in a few places. Some states dropped the leading zeroes and then added a suffix digit, which is fun.
Any identifier that is composed of digits but is not a number will accumulate a hilariously large number of mistakes and alterations like the ones you describe.
In my own work I see this all the time with FIPS codes and parcel identifiers -- mostly because someone has round-tripped their data through Excel which will autocast the identifiers to numeric types.
Federal GEOIDs are particularly tough because the number of digits defines the GEOID type and there are valid types for 10, 11 and 12-digit numbers, so dropping a leading zero wreaks havoc on any automated processing.
There are a lot of ways to create the garbage in GIGO.
There are 8 screen hole bytes in the bottom 8 text rows (64 bytes total) and 8 expansion slots, so the screen hole byte at offset "N" was often used by the expansion card's firmware in slot "N" to store up to 8 bytes of data[1] (one byte in each row's screen hole area). Overwriting those bytes could result in system crashes and hardware hangs.
Ah, I remember using the memory holes in the HIRES graphics memory for scratch-pad usage, but had forgotten about this part. I loved the Apple ][, since it (up to the //e) was capable of being fully understood by a single human being. Few if any computers since then have held that distinction.
I was under the impression that Playwright was a (very good) "how your code behaves when actually running in a browser" testing library, and Vitest was a "how your code behaves on its own" testing library.
But I'd tend to keep relatively independent test suites for my frontend model/state layer and the "how it behaves live in a browser" stuff, because that way, if I'm not currently iterating on my view components, I can test the rest of my frontend code changes without having to faff with headless browser instances.
(obviously you'd then want to run both before committing, and again under CI before merge, but still)
Depending on how thin your model/state layer is (and if you have a Sufficiently Rich Backend then it can probably be very thin without that being an issue) your situation may not match mine, of course.
But y'know, free data point, worth exactly what you paid :D
Reminds me of Intellectual Ventures' Optical Fence, developed to track and kill mosquitoes with short laser pulses.
As a side-effect of the precision needed to spatially locate the mosquitoes, they could detect different wing beat frequencies that allowed target discrimination by sex and species.
This laser mosquito killer is, and always has been, a PR whitewashing campaign for Intellectual Ventures' reputation.
This device has never been built, never been purchasable, and it is ALWAYS brought up whenever IV wants to talk about how cool they are.
And I say this as someone who loosely knew and was friends with a few people that worked there. They brought up this same invention when they were talking about their work. They eventually soured on the company once they saw the actual sausage being made.
IV is a patent troll, shaking down people doing the real work of developing products.
They trot out this invention, and a handful of others, to appear to be a public benefit. Never mind that most of these inventions don't really exist and have never been manufactured.
They hide the extent of their holdings, they hide the byzantine network of shell companies they use to mask their holdings, and they spend a significant amount of their money lobbying (bribing).
Why do they need to hide all of this?
Look at their front page, prominently featuring the "Autoscope", for fighting malaria. Fighting malaria sounds great, they're the good guys, right?
Now do a bit of web searching to try to find out what the Autoscope is and where it's being used. It's vaporware press release articles going back 8 years.
Look at their "spinouts" page, and try to find any real substance at all on these companies. It is all gossamer, marketing speak with nothing behind it when you actually go looking for it.
Meanwhile, they hold a portfolio of more than 40,000 patents, and they siphon off billions from the real economy.
Part of their "licensing agreement" is that you can't talk badly about them after they shake you down, or else the price goes up.
Surprisingly, there are different algorithms for doing something as simple as summing up a list of numbers.
The naïve way of adding numbers one by one in a loop is the obvious approach, but there are more sophisticated methods that give better bounds on the total accumulated error; Kahan summation[1] is one of the better-known ones.
Like most things, it can get complicated depending on the specific context. For streaming data, adding numbers one at a time may be the only option. But what if one could use a fixed-size buffer of N numbers? When a new number arrives, what should be done? Take a partial sum of some subset of the N numbers in the buffer and add that to a cumulative total? If a subset is chosen, how? Are there provable (improved) error bounds for the subset selection method?
As far as I know, xsum [1] more or less solves the problem completely: order invariant and exact, at less than 2x the cost of the naive summation. Further speeding it up may be the only avenue left for improvement.
A couple of pull-quotes from the paper to summarize:
> Much work has been done on trying to improve the accuracy of summation. Some methods aim to somewhat improve accuracy at little computational cost, but do not guarantee that the result is the correctly rounded exact sum.

> Many methods have been developed that instead compute the exact sum of a set of floating-point values, and then correctly round this exact sum to the closest floating-point value. This obviously would be preferable to any non-exact method, if the exact computation could be done sufficiently quickly.

> Exact summation methods fall into two classes — those implemented using standard floating point arithmetic operations available in hardware on most current processors, such as the methods of Zhu and Hayes (2010), and those that instead perform the summation with integer arithmetic, using a “superaccumulator”.

> I present two new methods for exactly summing a set of floating-point numbers, and then correctly rounding to the nearest floating-point number. ... One method uses a “small” superaccumulator with sixty-seven 64-bit chunks, each with 32-bit overlap with the next chunk, allowing carry propagation to be done infrequently. The small superaccumulator is used alone when summing a small number of terms. For big summations, a “large” superaccumulator is used as well. It consists of 4096 64-bit chunks, one for every possible combination of exponent bits and sign bit, plus counts of when each chunk needs to be transferred to the small superaccumulator.

> On modern 64-bit processors, exactly summing a large array using this combination of large and small superaccumulators takes less than twice the time of simple, inexact, ordered summation, with a serial implementation.
Thanks for the summary. I kinda low-key love the idea of converting floats into a fixed point representation that covers the entire range represented by the float type. I mean the accumulator is only 32 KB, which is likely to be in L1 the entire time on modern hardware, and any given float is only going to need two 64 bit words, + 13 bits (12 bits for offset, and 1 for sign) to be represented in this scheme.
Agreed, the differences aren't as stark as the map makes them seem. Almost all of the states fall between 15% and 25% of the population.
Wisconsin stands out at 25.29% of the population labeled as excessive drinkers, thanks to the large amount of dark red, but South Dakota is at 22.43% -- a difference of less than 3 percentage points, yet that state is mostly orange.
For a county example, Hardin, TX is at 19.13% and Rusk, WI is at 20.82%. A difference of just under 1.7 percentage points, but the Wisconsin county is bright red and the Texas county is orange.
https://glizen.com/radfordneal/ftp/xsum.pdf