spanspan's comments

spanspan · on Aug 14, 2023

It does have joins

rvba · on Aug 14, 2023

The splash page says that Learndb supports the following: select, from, where, group by, having, limit, order by

I don't see "JOIN" there.

So either the DB doesn't support it or the documentation (on the splash page!) is wrong.

Thanks for the downvote.

DANmode · on Aug 14, 2023

> Thanks for the downvote.

Please don't do this here.

Additionally, it could have been anyone.

spanspan · on Aug 14, 2023

I began using Python as a way to "mock" out the overall design, intending to re-implement it in Rust. The main reason for using Python was the ability to focus on "high-level" concepts and the speed of tinkering. This implements a single-process, single-thread, single-connection database, so performance and low-level concurrency control were not explicit goals or really optimized for. For those (real-life concerns), Rust or C++ are much better, but they also come with their own set of complexities.

spanspan · on Aug 14, 2023

Re: ACID guarantees

It doesn't have a notion of atomically batching multiple statements, i.e. a transaction. But beyond that, it's a single-file database, which can only have a single process (learndb instance) operating on the database (file). So you get consistency and isolation by virtue of it being a single-connection database. Durability you get to the extent that the file system is durable. So it's somewhere on the ACIDity spectrum.

Re: Query planning/optimization

I haven't implemented this, but I've considered where the optimization module could sit: the parser spits out an AST. This, or a derived intermediate representation, could be optimized, i.e. the AST could be rewritten or nodes deleted, before the VM executes the AST.
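To make the "rewrite the AST before the VM runs it" idea concrete, here is a minimal sketch of one such pass: constant folding over a WHERE expression. The node classes (`Literal`, `Column`, `BinaryOp`) and the `fold_constants` function are hypothetical names invented for this illustration; they are not learndb's actual AST or API.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical AST node types, assumed for illustration only.
@dataclass(frozen=True)
class Literal:
    value: object


@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class BinaryOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Literal, Column, BinaryOp]

# Operators that can be evaluated ahead of time when both sides are literals.
_FOLDABLE = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "<": lambda a, b: a < b,
    ">": lambda a, b: a > b,
    "=": lambda a, b: a == b,
}


def fold_constants(node: Expr) -> Expr:
    """Return a rewritten tree where literal-only subexpressions are pre-evaluated."""
    if isinstance(node, BinaryOp):
        left = fold_constants(node.left)
        right = fold_constants(node.right)
        if (
            isinstance(left, Literal)
            and isinstance(right, Literal)
            and node.op in _FOLDABLE
        ):
            # Both operands are known at plan time, so replace the node.
            return Literal(_FOLDABLE[node.op](left.value, right.value))
        return BinaryOp(node.op, left, right)
    # Literals and column references are left unchanged.
    return node


# Roughly: WHERE price > 10 * 2
where = BinaryOp(">", Column("price"), BinaryOp("*", Literal(10), Literal(2)))
optimized = fold_constants(where)
```

After the pass, the `10 * 2` subtree has been replaced by a single literal, so the VM would no longer need to recompute it for every row. A real optimizer would presumably chain several passes like this one.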

spanspan · on Aug 14, 2023

̶T̶h̶i̶s̶ ̶s̶t̶r̶i̶n̶g̶ ̶i̶s̶ ̶t̶h̶e̶ ̶g̶r̶a̶m̶m̶a̶r̶ ̶t̶h̶a̶t̶ ̶l̶a̶r̶k̶ ̶u̶s̶e̶s̶ ̶t̶o̶ ̶p̶a̶r̶s̶e̶ ̶t̶h̶e̶ ̶u̶s̶e̶r̶ ̶s̶u̶b̶m̶i̶t̶t̶e̶d̶ ̶s̶q̶l̶ ̶i̶n̶t̶o̶ ̶a̶n̶ ̶A̶S̶T̶.̶ ̶T̶h̶i̶s̶ ̶p̶a̶r̶s̶i̶n̶g̶ ̶i̶s̶ ̶f̶a̶r̶ ̶m̶o̶r̶e̶ ̶c̶o̶m̶p̶l̶e̶t̶e̶ ̶a̶n̶d̶ ̶r̶o̶b̶u̶s̶t̶ ̶t̶h̶a̶n̶ ̶w̶h̶a̶t̶ ̶d̶o̶i̶n̶g̶ ̶t̶h̶i̶s̶ ̶i̶n̶ ̶p̶u̶r̶e̶ ̶p̶y̶t̶h̶o̶n̶ ̶w̶o̶u̶l̶d̶ ̶a̶l̶l̶o̶w̶ ̶(̶w̶i̶t̶h̶o̶u̶t̶ ̶o̶f̶ ̶c̶o̶u̶r̶s̶e̶ ̶i̶m̶p̶l̶e̶m̶e̶n̶t̶i̶n̶g̶ ̶t̶h̶e̶ ̶e̶n̶t̶i̶r̶e̶ ̶l̶e̶x̶e̶r̶ ̶a̶n̶d̶ ̶p̶a̶r̶s̶e̶r̶ ̶t̶h̶a̶t̶ ̶l̶a̶r̶k̶ ̶i̶m̶p̶l̶e̶m̶e̶n̶t̶s̶)̶.̶

T̶h̶e̶r̶e̶ ̶m̶a̶y̶ ̶b̶e̶ ̶s̶o̶m̶e̶ ̶p̶o̶s̶t̶ ̶p̶a̶r̶s̶i̶n̶g̶ ̶v̶a̶l̶i̶d̶a̶t̶i̶o̶n̶ ̶t̶h̶a̶t̶ ̶c̶a̶n̶ ̶b̶e̶ ̶d̶o̶n̶e̶ ̶h̶e̶r̶e̶-̶ ̶b̶u̶t̶ ̶t̶h̶a̶t̶ ̶w̶o̶u̶l̶d̶ ̶b̶e̶ ̶s̶o̶m̶e̶t̶h̶i̶n̶g̶ ̶t̶h̶a̶t̶'̶s̶ ̶b̶e̶y̶o̶n̶d̶ ̶t̶h̶e̶ ̶d̶o̶m̶a̶i̶n̶ ̶o̶f̶ ̶t̶h̶e̶ ̶p̶a̶r̶s̶e̶r̶.̶

Edit: I see what you mean. I surveyed a bunch of parser generator libraries, and they also seemed to use a text-based DSL, rather than a DSL based on Python structures. What you're describing would have made the grammar development more ergonomic and simple.

spanspan · on Aug 13, 2023

Single file, embedded database with similar logical organization

keithalewis · on Aug 13, 2023

Perhaps a more accurate claim would be "SQLite inspired". Calling it a clone is misleading.

Mad props to the author. Many Python programmers never had proper training in computer science, so it is encouraging to see people filling in the gaps of their knowledge.

hluska · on Aug 14, 2023

This is a very early release, whereas SQLite has 22 years of releases. In that light, this is about the least charitable take on this.

Someone in our community built something and had the courage to release it. Your criticism is unfair.

esjeon · on Aug 14, 2023

> Your criticism is unfair.

I think it doesn't even get close to being a criticism, and it's certainly unclear if the goal is to literally clone SQLite or to implement SQLite-ish. This is a fair question.

hluska · on Aug 15, 2023

Why did they have to use the term misleading? Why not be charitable?

keithalewis · on Aug 14, 2023

Just trying to encourage clear and accurate communication. I agree with your first sentence. The only thing unfair is the author claiming it is a SQLite clone. It isn't, as we both seem to agree. It is a form of cheating.

spanspan · on Aug 14, 2023

Fair. It’s “inspired by”, not a “clone”. Frankly, I don’t think these terms are that specific, that one couldn’t level the same point against “inspired by”.. in what sense is it inspired?

spanspan · on Aug 13, 2023

It would be a fun exercise to implement something like TPC-C for learndb and see how it does.

spanspan · on Aug 13, 2023

Cool stuff. I had similar intuitions: Python allowed me to focus on the high-level concepts. Albeit, there were times when I wished I had gone with a statically-typed, compiled language.

spanspan · on Aug 13, 2023

Most definitely. The b-tree implementation was the first motivation for starting the project, especially all the details around node rebalancing and splitting. And the fact that it was an on-disk structure added another wrinkle to thinking about the implementation.
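For readers unfamiliar with the splitting step, here is a minimal in-memory sketch of what happens when a leaf overflows. It assumes a B+-tree-style split where the first key of the new right node is copied up as the separator; `MAX_KEYS` and `insert_into_leaf` are hypothetical names for this illustration, not learndb's actual code, and the on-disk page layout is ignored entirely.

```python
from bisect import insort

MAX_KEYS = 4  # assumed leaf capacity, chosen small for illustration


def insert_into_leaf(keys, key):
    """Insert `key` into a sorted leaf, splitting the leaf if it overflows.

    Returns (left_keys, right_keys, separator). When no split is needed,
    right_keys and separator are both None.
    """
    keys = list(keys)  # copy so the caller's list is not mutated
    insort(keys, key)  # keep the leaf sorted
    if len(keys) <= MAX_KEYS:
        return keys, None, None
    # Overflow: divide the keys between the old node and a new sibling.
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]
    # The parent would receive this separator to route lookups between the two.
    return left, right, right[0]


no_split = insert_into_leaf([10, 20], 15)
split = insert_into_leaf([10, 20, 30, 40], 25)
```

The parent node then has to absorb the separator, which may overflow the parent in turn; that recursive case (splitting internal nodes) is where most of the complexity lives.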

spanspan · on Aug 18, 2021

This incomplete tutorial is about how to write a sqlite clone, with particular emphasis on how to implement the underlying b-tree: https://cstack.github.io/db_tutorial/

ddlutz · on Aug 18, 2021

for YEARS that tutorial has been stuck on "Alright. One more step toward a fully-operational btree implementation. The next step should be splitting internal nodes. Until then!". Must not be worth the time to finish writing the tutorial.

spanspan · on Aug 18, 2021

This is really cool. I recently attempted to build a toy database, and subsequently implemented my own b-tree (https://github.com/spandanb/learndb-py). I ended up running into a lot of these issues.

I also did a write-up on why everyone should engage in similar projects: https://www.spandanbemby.com/joys-of-database.html
