I like to explain things by example, and interactive examples are even better. So I built a sandbox server that lets you run virtually any software. And a widget to easily add interactive code examples to any kind of technical documentation.
Great tips, thank you! The thing is, getting maximum throughput is not the goal of the project (at least not at this stage). I'm using reasonable SQLite defaults (including WAL), but that's it for now.
Thank you! I also think that the relational model can get you pretty far if you don't need to squeeze every last bit of performance out of the program. And the added benefit of using a battle-tested SQL engine is far fewer storage-related bugs.
SQLite only allows one writer at a time, so concurrent writes will fail with a "database is locked" (SQLITE_BUSY) error.
There are two ways to enforce the single writer rule:
1. Use a mutex for write operations.
2. Set the maximum number of DB connections to 1.
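The mutex approach can be sketched in Python like this (a minimal sketch, not Redka's actual code; the `kv` table, file path, and helper names are made up for illustration). The second approach instead caps the connection pool at a single connection, e.g. `db.SetMaxOpenConns(1)` in Go's `database/sql`.

```python
import os
import sqlite3
import tempfile
import threading

db_path = os.path.join(tempfile.mkdtemp(), "app.db")  # illustrative path

# Approach 1: one shared write connection guarded by a mutex.
# Reads use their own connections and never take the lock.
write_lock = threading.Lock()
write_conn = sqlite3.connect(db_path, check_same_thread=False)
with write_conn:
    write_conn.execute(
        "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)")

def set_value(key, value):
    with write_lock:      # only one writer at a time
        with write_conn:  # commits on success
            write_conn.execute(
                "INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))

def get_value(key):
    conn = sqlite3.connect(db_path)  # readers are not serialized
    try:
        row = conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
        return row[0] if row else None
    finally:
        conn.close()

set_value("greeting", "hello")
print(get_value("greeting"))  # prints: hello
```

Approach 2 drops the lock and instead funnels every operation, reads included, through a single pooled connection, which is what capping the pool at one connection amounts to.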
Intuitively, the mutex approach seems better, because it does not limit the number of concurrent read operations. The benchmarks show the following results:
- GET: 2% better rps and 25% better p50 response time with mutex
- SET: 2% better rps and 60% worse p50 response time with mutex
Due to the significant p50 penalty for SET with the mutex approach, I've decided to use the max connections approach for now.
This is really not true in WAL mode with synchronous NORMAL; it was only true with the default journal mode, and a lot of people misuse SQLite because of that. You still have one writer at a time, but you won't get the SQLITE_BUSY error.
You can check the documentation [1]; only some rare edge cases return this error in WAL mode. We abuse our SQLite, and I've never seen it happen with a WAL database.
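To illustrate the point (a minimal Python sketch using the stdlib `sqlite3` module; the file path and table are made up): with WAL and synchronous NORMAL, a reader can run while a write transaction is open, and it simply sees the last committed snapshot instead of failing with SQLITE_BUSY.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

def connect():
    # isolation_level=None → explicit transaction control;
    # timeout=0 → fail immediately with "database is locked" rather than wait
    conn = sqlite3.connect(path, isolation_level=None, timeout=0)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA synchronous=NORMAL")
    return conn

writer = connect()
writer.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")
writer.execute("INSERT INTO kv VALUES ('a', '1')")

reader = connect()

# Open a write transaction and leave it uncommitted...
writer.execute("BEGIN IMMEDIATE")
writer.execute("UPDATE kv SET value = '2' WHERE key = 'a'")

# ...the reader is not blocked and sees the last committed value.
row = reader.execute("SELECT value FROM kv WHERE key = 'a'").fetchone()
print(row[0])  # prints: 1

writer.execute("COMMIT")
```

With the rollback journal instead of WAL, that SELECT would hit "database is locked" while the write transaction is open.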
How about having two pools: one for writes only, and the other for reads? SQLite allows you to open the DB in more than one thread per application, so you can have a read pool and a write pool with SetMaxConnections(1) for better performance. Of course, this also means that reads should be handled separately from writes in the API layer.
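The two-pool idea could be roughly sketched like this (hypothetical Python, using the stdlib `sqlite3` and a `queue.Queue` as a toy read pool; the `KVStore` class and its names are made up, and in Go you'd use two `database/sql` pools instead):

```python
import os
import queue
import sqlite3
import tempfile
import threading

db_path = os.path.join(tempfile.mkdtemp(), "kv.db")  # illustrative path

class KVStore:
    def __init__(self, path, n_readers=4):
        # Write "pool" of exactly one connection, serialized by a lock.
        self.write_conn = sqlite3.connect(path, check_same_thread=False)
        self.write_conn.execute("PRAGMA journal_mode=WAL")
        with self.write_conn:
            self.write_conn.execute(
                "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)")
        self.write_lock = threading.Lock()
        # Read pool: several connections that can run concurrently in WAL mode.
        self.readers = queue.Queue()
        for _ in range(n_readers):
            self.readers.put(sqlite3.connect(path, check_same_thread=False))

    def set(self, key, value):
        with self.write_lock, self.write_conn:
            self.write_conn.execute(
                "INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))

    def get(self, key):
        conn = self.readers.get()  # borrow a read connection
        try:
            row = conn.execute(
                "SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
            return row[0] if row else None
        finally:
            self.readers.put(conn)  # return it to the pool

store = KVStore(db_path)
store.set("name", "redka")
print(store.get("name"))  # prints: redka
```

The split also makes the read/write separation in the API layer explicit: `get` can only ever touch the read pool, so a stray write can't sneak onto a reader connection.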
Well, I agree, that's a good starting point. You probably won't be able to beat Redis with SQLite anyway :). Although, given that WAL mode allows concurrent reads, it might get a large enough performance boost to match Redis in QPS if the concurrency is high enough.
I'm a big fan of both Redis and SQLite, so I decided to combine the two. SQLite is specifically designed for many small queries[1], and it's probably as close as relational engines can get to Redis, so I think it might be a good fit.
I love this; it's the solution that makes sense for 90% of the cases where I've used Redis with Python.
I’ve made several versions of this, and to be honest, it ended up being so straightforward that I assumed it was a trivial solution.
This is pretty well-planned. This is 100% the way to go.
Heh. I took a detour into making my idea of “streams” also solve event sourcing in native python; dumb idea, if interesting. Mission creep probably killed my effort!
What are the project goals?
I assume it's a drop-in replacement for Redis that is supposed to be better in certain cases? If yes, then what cases do you have in mind?
So the goal is to have a Redis-like API but not actually be an in-memory data store and reduce memory consumption this way? For example, for a project/service that started with Redis, but then priorities shifted and small memory footprint became more important than performance? Did I get it right?
That's pretty cool. Reckon it would work with existing code that calls Redis over the wire for RQ?
https://python-rq.org
This RQ stuff has been a pain with a recent project, because only Python seems to use it, so once an RQ job has been submitted, only Python-based things can do anything with it. :(
If Redka works as a backend replacement, we could potentially have non-Python things check the SQLite database instead.
It works with redis-py (which python-rq uses), but I doubt it will be any good in this case. python-rq seems to use Lua scripting in Redis, which is not planned for 1.0. I'd rather not add it at all, but we'll see.
You want to use an in-memory database like SQLite or DuckDB, or even replicate in code the data structures that Redis uses. Redis doesn't really bring a whole lot to the table over a general-purpose programming language with a good standard library, other than being shared over a network.
There's also an interactive "Git by Example" guide I recently wrote on a similar topic. It has a playground for each of the commands mentioned in Martin's article (and many others):
You can use it to build your own sandboxes, or use existing ones to write interactive guides like these: https://github.com/nalgeon/tryxinyminutes