hasheddan's favorites

dspillett on May 22, 2018 | on: Recursive Common Table Expressions in Postgres

Be careful of performance when using CTEs in Postgres: unlike in other DBMSs, they act as optimisation fences with regard to predicate pushdown, so some queries will incur extra scans for every level of recursion needed.

This doesn't affect all queries, of course, and where it does the difference may not be significant compared to what else is going on (e.g. querying a small tree/graph structure to pull out some large/complex data), but it is something to watch out for when working with data of any appreciable size.

Nextgrid on June 12, 2020 | on: Twilio Super Sim – Public Beta

If you're interested in this space, Hologram does the same thing (and appears to have had a head start compared to Twilio): https://hologram.io

KMag on March 10, 2016 | on: Brendan Eich: WebAssembly is a game-changer

About the binary encoding... It's a bit easy to armchair these things, and it's too late for WebAsm now... but if you're on the V8 team, you have access to Google's PrefixVarint implementation (originally by Doug Rhode, IIRC from my time as a Google engineer). A 128-bit prefix varint is exactly as big as an LEB128 int in all cases, but is dramatically faster to decode and encode. It's closely related to the encoding used by UTF-8. Doug benchmarked PrefixVarints and found both Protocol Buffer encoding and Protocol Buffer decoding would be significantly faster if they had thought of using a UTF-8-like encoding.

LEB128 requires a mask operation and a branch operation on every single byte, maybe skipping the final byte, so up to 18 of each for a 128-bit value, which takes 19 bytes. Using 32-bit or 64-bit native loads gets tricky, and I suspect all of the bit twiddling necessary makes it slower than the naive byte-at-a-time mask-and-branch.

    7 bits -> 0xxxxxxx
    14 bits -> 1xxxxxxx 0xxxxxxx
    ...
    35 bits -> 1xxxxxxx 1xxxxxxx 1xxxxxxx 1xxxxxxx 0xxxxxxx
    ...
    128 bits -> 1xxxxxxx 1xxxxxxx 1xxxxxxx ... xxxxxxxx
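To make the per-byte cost concrete, here is a minimal Python sketch of standard unsigned LEB128 (low-order 7-bit groups first, a continuation bit on every byte). It does not cap the final byte the way the 128-bit row above does, and the function names are illustrative rather than taken from any library.

```python
from typing import Tuple


def leb128_encode(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F              # low 7 data bits
        value >>= 7
        if value:
            out.append(byte | 0x80)      # continuation bit: more bytes follow
        else:
            out.append(byte)             # final byte: continuation bit clear
            return bytes(out)


def leb128_decode(data: bytes) -> Tuple[int, int]:
    """Return (value, bytes_consumed): one mask and one branch per byte."""
    result = 0
    shift = 0
    for i, byte in enumerate(data):
        result |= (byte & 0x7F) << shift  # mask off the continuation bit
        if not (byte & 0x80):             # branch: was this the last byte?
            return result, i + 1
        shift += 7
    raise ValueError("truncated LEB128 value")
```

The loop in `leb128_decode` cannot know the length until it has inspected every byte, which is the cost the comment is describing.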

Prefix varints just shift that unary encoding to the front, so you have at most 2 single-byte switch statements, for less branch misprediction, and for larger sizes it's trivial to make use of the processor's native 32-bit and 64-bit load instructions (assuming a processor that supports unaligned loads).

    7 bits -> 0xxxxxxx
    14 bits -> 10xxxxxx xxxxxxxx
    ...
    35 bits -> 11110xxx xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx
    ...
    128 bits -> 11111111 11111111 xxxxxxxx xxxxxxxx ... xxxxxxxx
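A minimal Python sketch of the same idea, limited to 64-bit values to keep it short. The byte order (big-endian, UTF-8 style), the 9-byte cap, and the function names are assumptions made for illustration; this is not Google's PrefixVarint implementation, whose exact layout is not given here.

```python
from typing import Tuple


def prefix_varint_encode(value: int) -> bytes:
    """Encode 0 <= value < 2**64 with the length as a unary prefix in byte 0."""
    if not 0 <= value < (1 << 64):
        raise ValueError("value out of range for this 64-bit sketch")
    for extra in range(8):                      # extra = bytes after the first
        if value < (1 << (7 * (extra + 1))):
            # `extra` one-bits followed by a zero bit, at the top of byte 0.
            prefix = (0xFF << (8 - extra)) & 0xFF
            return (value | (prefix << (8 * extra))).to_bytes(extra + 1, "big")
    # Assumed cap: eight one-bits, then the full 64-bit value in 8 bytes.
    return b"\xff" + value.to_bytes(8, "big")


def prefix_varint_decode(data: bytes) -> Tuple[int, int]:
    """Return (value, bytes_consumed); the length comes from byte 0 alone."""
    if not data:
        raise ValueError("empty input")
    first = data[0]
    extra = 0
    while extra < 8 and first & (0x80 >> extra):  # count leading one-bits
        extra += 1
    total = extra + 1
    if len(data) < total:
        raise ValueError("truncated prefix varint")
    if extra == 8:
        return int.from_bytes(data[1:9], "big"), 9
    mask = (1 << (7 * total)) - 1                 # strip the prefix bits
    return int.from_bytes(data[:total], "big") & mask, total
```

Once the first byte is inspected, the decoder knows the full length and can read the payload in one step; in a native implementation the leading-ones count would typically be a table lookup or a count-leading-zeros instruction rather than a loop.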

There's literally no advantage to LEB128, other than that more people have heard of it. A 128-bit PrefixVarint is always the same number of bytes; it just puts the length-encoding bits all together so you can more easily branch on them, and keeps them out of the way of native loads for your data bits.

Also, zigzag encoding and decoding is faster than sign extension, for variable-length integers. Protocol Buffers got that part right.
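For reference, zigzag maps signed integers to unsigned ones so that small magnitudes get short encodings: 0, -1, 1, -2, 2 become 0, 1, 2, 3, 4. A minimal Python sketch, assuming a 64-bit width by default (the function names are illustrative):

```python
def zigzag_encode(n: int, bits: int = 64) -> int:
    """Map a signed integer in [-2**(bits-1), 2**(bits-1)) to an unsigned one."""
    # n >> (bits - 1) is 0 for non-negative n and -1 (all ones) for negative n.
    return ((n << 1) ^ (n >> (bits - 1))) & ((1 << bits) - 1)


def zigzag_decode(z: int) -> int:
    """Inverse of zigzag_encode: the low bit carries the sign."""
    return (z >> 1) ^ -(z & 1)
```

Both directions are a shift and an XOR with no branch, and the result can be fed straight into an unsigned varint encoder.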

Note that, for security, if an encoding has no non-canonical representations, there can't be security bugs due to developers forgetting to check for them. For this reason, you may want to use a bijective base 256[0] encoding, so that there aren't multiple encodings for a single integer. In the UTF-8 world, there have been several security issues due to UTF-8 decoders not properly checking for non-canonical encodings and programmers doing slightly silly checks against constant byte arrays. A bijective base 256 encoding saves you less than half a percent in space usage, but the cost is only one subtraction at encoding time and one addition at decoding time.

[0] https://en.wikipedia.org/wiki/Bijective_numeration
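As a rough illustration of why a bijective encoding has no non-canonical forms, here is a sketch of plain bijective base 256 in Python, where each stored byte b stands for the digit b + 1. How this would be folded into a length-prefixed varint is not shown, and the function names are hypothetical.

```python
def bijective256_encode(n: int) -> bytes:
    """Encode a non-negative integer; 0 maps to the empty byte string."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    out = bytearray()
    while n > 0:
        n -= 1                  # the one subtraction: digits run 1..256
        out.append(n % 256)     # digit d is stored as the byte d - 1
        n //= 256
    return bytes(reversed(out))


def bijective256_decode(data: bytes) -> int:
    """Decode any byte string; every input maps to a distinct integer."""
    n = 0
    for b in data:
        n = n * 256 + (b + 1)   # the one addition per digit
    return n
```

Because leading zero bytes are significant here (`b"\x00"` is 1 and `b"\x00\x00"` is 257), there is no way to pad a value into a second, longer encoding, so the decoder has nothing to reject.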

sriram_malhar on Oct 20, 2019 | on: Structure and Interpretation of Computer Programs ...

SICP shows up every few months on HN, and I upvote enthusiastically every single time!

I am a self-taught programmer, and survived a long time on C/C++/Java etc. I was smugly confident in my confined space.

Then I came across SICP. I discovered a world of closures, streams, object-orientation done using closures, infinite series using streams, lazy evaluation, functional programming, and so much more. It was such a sharp pivot that I registered for a PhD because I just had to know what else I took for granted. I was 42 when I registered (graduated at 48)!
