DuckDB is both a standalone tool and a component. This effort is actually very coherent and brings it back into a familiar usage model: that of a traditional client-server RDBMS.
RDBMSs have always been multi-user concurrent systems. DuckDB is a very fast local engine that has a multitude of use cases because it is embeddable in other systems.
It’s like saying what does SQLite wanna be? It’s in your phones, your browser, your desktop apps, IoT devices, and people have extended it in different directions. The only difference here is that this is first party, not third party. But to me it’s a very legible move.
But why though? DuckDB can still be used as a local query engine — I still use it as that. I haven’t touched any of the DuckLake stuff and the duckdb cli and Python library are still my bread and butter. They can add new use cases, but it doesn’t affect the core engine.
Is the concern that the duckdb messaging is now diluted by it having all these extra features? That you can’t sell it to friends as “this thing” like you can a one use tool like curl? I get that, but I also feel that duckdb is so much bigger than a “do one thing and do it well” tool.
It’s an engine that drives the modern data tool stack. DuckDB’s team has been prescient in that it has made many tasteful bets on what users want: the ability to interop with pandas and polars, the addition of geospatial, the plug-in infra. They’re all optional, but when you need these things, they’re so useful. They’ve also clued me into what the broader data world is thinking about (I didn’t know about sketches and Hilbert curves, but those are so useful in probabilistic large-scale queries and in geospatial queries). And they exist in larger database systems like Redshift too.
So far duckdb’s bets have been tasteful, and mostly ignorable if you don’t happen to use them.
I read it less as "DuckDB wants to become Postgres" and more as DuckDB becoming an execution layer inside bigger workflows.
The engine is often not the painful part anymore. The pain is the stuff around it: live DBs, S3 paths, Parquet files, credentials, repeatable runs, exports, validation, and the moment a one-off script quietly becomes infrastructure.
Quack makes the remote/server part cleaner, but the bigger trend seems to be DuckDB becoming the SQL layer inside tools, not necessarily the final user-facing tool.
Our data pipeline produces .duckdb files that our app downloads (it watches the asset in S3 and pulls when the ETag changes). Makes it easy to get BQ/ClickHouse-like performance without running or paying for that infrastructure. Not perfect for all cases, but it handles a lot more than you would expect.
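A minimal sketch of the "pull when the ETag changes" loop described above. The two callables, `fetch_etag` and `download`, are hypothetical stand-ins for whatever S3 client the app uses (for example a HEAD request and a GET request); this is an illustration under those assumptions, not the poster's actual code.

```python
from typing import Callable, Optional


class EtagWatcher:
    """Re-download a remote asset only when its ETag changes.

    fetch_etag and download are hypothetical hooks, not a real S3 API:
    fetch_etag() returns the asset's current ETag string, and
    download() fetches the asset and returns the local path it wrote.
    """

    def __init__(self, fetch_etag: Callable[[], str], download: Callable[[], str]):
        self._fetch_etag = fetch_etag
        self._download = download
        self._last_etag: Optional[str] = None
        self.current_path: Optional[str] = None

    def poll_once(self) -> bool:
        """Check the ETag once; return True if a new copy was pulled."""
        etag = self._fetch_etag()
        if etag == self._last_etag:
            return False  # unchanged, keep serving the existing file
        # Download first, then record the ETag, so a failed download
        # is retried on the next poll.
        self.current_path = self._download()
        self._last_etag = etag
        return True
```

In an app you would call `poll_once()` on a timer and, whenever it returns True, reopen the DuckDB connection against `current_path`.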
The use case is local user DuckDB talking to MotherDuck for $.
This is not commercially a terrible idea. Why keep paying Snowflake for bog-standard SQL query workload when SF makes it easy to migrate to Iceberg & commodity engines like MotherDuck?
Hello, DuckDB DevRel here. Quack is independent from MotherDuck. MotherDuck has its own proprietary protocol, which has been around for years and it supports things like dual execution – see more here:
Sure! Not knocking the architecture: Building out peer-to-peer federation in place of client/server makes perfect sense for DuckDB. And I’m a big fan of owning the protocol so you can optimize it to internal structures.
Just making the point that DuckDB is disruptive technology & what it’s most likely to disrupt.
Compared to what exactly? Snowflake? Hiring an engineer to deploy DuckDB? A hobby project? FWIW I work at MotherDuck so obviously biased, but curious to hear what makes you say that.
uh, doing analytics type queries on large datasets that postgres would choke on, as an RPC? I'm using it (ducklake specifically) to build a lakehouse RPC server that can scale horizontally based on resource utilization in k8s.
Right, I get that use case. You have to crunch numbers that sit somewhere, and store the outputs in the same place. DuckLake is great for that. But where does this DuckDB client-server setup fit in?
Sounds like it means you don't have to wire up the RPC server yourself anymore? Just build a docker container that invokes this quack server command, expose it over the network and connect to it from remote clients using your own access controls?
Ducklake handles the metadata and storage, but a local duckdb instance connected to it still has to do the compute itself. This lets you federate access to the compute.
Fun for me, I just finished a big streaming implementation doing essentially the same thing in Go-gRPC with arrow table record batches. It was fun though.
If you see the state of what is called "production quality" software these days, alpha/beta quality has lost most of its meaning. You now just wait for "next prod"
I went with MiniMax. The token plans are over what I currently need: 4500 messages per 5h, 45000 messages per week for $40. I can run multiple agents and they don't think for 5-10 minutes like Sonnet did. Also I can finally see the thinking process, while Anthropic chose to hide it all from me.
Rotations are usually two-phased. Add the new secret/credential to the endpoint, so that both new and old are active and valid. Release the new secret/credential to clients of that endpoint, and wait until you don't see any requests using the old credential.
Then you remove the old credential from the endpoint.
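The two phases above can be sketched with a toy in-memory endpoint. Everything here (`Endpoint`, `begin_rotation`, `finish_rotation`) is a hypothetical name made up for illustration; a real system would track usage from request logs or metrics rather than a dict.

```python
class Endpoint:
    """Toy endpoint that accepts any currently valid credential."""

    def __init__(self, credential: str):
        self.valid = {credential}
        self.seen = {}  # credential -> requests in the current observation window

    def handle_request(self, credential: str) -> bool:
        """Return True if the request is accepted, recording which credential was used."""
        if credential not in self.valid:
            return False
        self.seen[credential] = self.seen.get(credential, 0) + 1
        return True


def begin_rotation(endpoint: Endpoint, new: str) -> None:
    """Phase 1: accept the new credential alongside the old one."""
    endpoint.valid.add(new)
    endpoint.seen.clear()  # start a fresh observation window


def finish_rotation(endpoint: Endpoint, old: str) -> bool:
    """Phase 2: remove the old credential only if nobody used it this window.

    Returns False (and starts a new window) if the old credential was
    still in use, so the caller can wait and try again later.
    """
    if endpoint.seen.get(old, 0) > 0:
        endpoint.seen.clear()
        return False
    endpoint.valid.discard(old)
    return True
```

The point of the design is that clients never see a moment where their credential is invalid: the old one only disappears after a full window with no traffic on it.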
Ideally, you can have a couple of working versions at any given time. For instance, an AWS IAM user can have up to two access keys at once. To rotate them, you make sure only one key exists (an inactive key still counts toward the limit, so delete the spare rather than just deactivating it), create a new key, and make that new key the new production value. Once everything's using that key, you can deactivate the old one.
I have an even cheesier competitor, which randomly has a dragon on the lid (it would be a terrible choice for all but the wimpiest casual gaming... but it makes a good Home Assistant HAOS server!)
I can run my N100 NUC at 4W wall socket power draw at idle. If I keep turbo boost off, it also stays close to that under normal load, up to 6W at full load, but then it is also terribly slow. With turbo boost enabled, power draw can go to 8-10W at full load.
Not sure how this compares to the Orange Pi in terms of performance per watt, but for me it is already pretty far into marginal-gains territory, at the cost of having to deal with ARM, custom housing, adapters that keep the wall socket draw efficient, etc. An efficient PicoPSU to power a Pi or Orange Pi is also not cheap.
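To put "marginal gains" into numbers, here is a rough sketch that converts a constant wall draw into energy per year, using the wattages quoted above. Running 24/7 at a constant draw is an assumption, and `annual_kwh` is just an illustrative helper.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_kwh(watts: float) -> float:
    """Energy used in a year at a constant wall draw, in kWh."""
    return watts * HOURS_PER_YEAR / 1000


# Wattages taken from the comment above; 10 W is the upper end of the boost range.
for label, watts in [("idle", 4), ("boost off, full load", 6), ("boost on, full load", 10)]:
    print(f"{label}: {annual_kwh(watts):.1f} kWh/year")
```

At 4W that is roughly 35 kWh a year and at 10W roughly 88 kWh, so shaving another watt or two saves on the order of 10-20 kWh annually.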
Boost enabled.
WiFi disabled.
No changes to P-states/clock states or anything else in the BIOS.
Fedora.
Applied all suggestions from powertop.
I don’t recall changing anything else.
Not the poster you're replying to, but I run an Acer laptop with an N305 CPU as a Plex server. Idle power draw with the lid closed is 4-5W and I keep the battery capped at 80% charge.
The N100/150/200/etc. can be clocked to use less power at idle (and capped for better thermals, especially in smaller or power-constrained devices).
A lot of the cheaper mini PCs seem to let the chip go wild, and don't implement sleep/low power states correctly, which is why the range is so wide. I've seen N100 boards idle at 6W, and others idle at 10-12W.
That's quite a lot for the very heatsink that still results in those overheating problems I mentioned. A standard CPU cooler will not be mountable on this in any reasonable way, that's like parking a truck on a lawn chair.
I recommend using Syncthing. It's very easy to self-host, but I actually use a SaaS for it: syncding.com. It gets me 100 GB to 1 TB of disk in which I can create folders and keep them synced with my 2 laptops, my phone, my server, etc. I have an Obsidian vault with Meld Encrypt to encrypt some files, a KeePassXC file I share across my devices, and my todo.txt.
It's simple to set up and will work forever, instead of paying for different providers that might shut down or increase their prices.
I originally built it for my own setup (multiple devices, encrypted files, etc.), and it kind of grew from there.
Not everyone has a NAS at home, so the idea behind syncding.com was to provide a simple, encrypted online Syncthing hub that just works without the usual setup — with built-in ZFS snapshots for versioning and recovery.
Always cool to see others using similar workflows.
You: "they missed this feature that's in the docs!"
You're right that it's an important part of CC's config. But it doesn't fit the article's raison d'être.