arjunnarayan's comments | Hacker News

Hi! I'm one of the two authors here. At Materialize, we're definitely of the 'we are a bunch of voices, we are people rather than corp-speak, and you get our largely unfiltered takes' flavor. This is my (and George's from Fivetran) take. In particular this is not Frank's take, as you attribute below :)

> SQL is declarative; reactive Materialize streams are declarative on a whole new level.

Thank you for the kind words about our tech; I'm flattered! That said, this dream is downstream of Kafka. Most of our quibbles with the Kafka-as-database architecture have to do with the fact that it neglects the work that needs to be done _upstream_ of Kafka.

That work is best done with an OLTP database. Funnily enough, neither of us is building an OLTP database, but this piece is largely a defense of OLTP databases (if you're curious, yes, I'd recommend CockroachDB) and their virtues at the head of the data pipeline.

Kafka has its place - and when it's used downstream of CDC from said OLTP database (using, e.g., Debezium), we could not be happier with it (and we say so).

The best example is foreign key checks. Things get ugly if you ever need to enforce foreign key checks (which translates to checking a denormalization of your source data _transactionally_ as part of deciding whether to admit or deny an event). This is something you may not need in your data pipeline on day 1, but adding it in later is a trivial schema change with an OLTP database, and exceedingly difficult with a Kafka-based event-sourced architecture.
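To make that "trivial schema change" concrete, here's a minimal sketch in Postgres-flavored SQL; the table and column names are invented for illustration:

    -- Orders were originally accepted without any referential check.
    -- Adding the constraint later is one DDL statement: the database
    -- validates all existing rows, then enforces the check transactionally
    -- for every future write.
    ALTER TABLE orders
        ADD CONSTRAINT orders_customer_fk
        FOREIGN KEY (customer_id) REFERENCES customers (id);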

> Normally you'd have single writer instances that are locked to the corresponding Kafka partition, which ensure strong transactional guarantees, IF you need them.

This still does not deal with the use-case of needing to add a foreign key check. You'd have to:

1. Log "intents to write" rather than writes themselves in Topic A 2. Have a separate denormalization computed and kept in a separate Topic B, which can be read from. This denormalization needs to be read until the intent propagates from Topic A. 3. Convert those intents into commits. 4. Deal with all the failure cases in a distributed system, e.g. cleaning up abandoned intents, etc.

If you use an OLTP database and generate events into Kafka via CDC, you get the best of both worlds. And hopefully, yes, you have a reactive, declarative stack downstream of that as well!
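For the curious, a rough sketch of what consuming that CDC stream looks like on the Materialize side (broker, topic, and registry addresses are placeholders, and the syntax is abbreviated from the docs, so details may differ):

    -- Ingest the Debezium change stream that CDC produces from the OLTP database.
    CREATE SOURCE orders
    FROM KAFKA BROKER 'kafka:9092' TOPIC 'pg.public.orders'
    FORMAT AVRO USING CONFLUENT SCHEMA REGISTRY 'http://registry:8081'
    ENVELOPE DEBEZIUM;

    -- Downstream views then react to committed transactions, not raw events.
    CREATE MATERIALIZED VIEW order_counts AS
    SELECT customer_id, count(*) AS orders
    FROM orders
    GROUP BY customer_id;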


> 1. Log "intents to write" rather than writes themselves in Topic A 2. Have a separate denormalization computed and kept in a separate Topic B, which can be read from. This denormalization needs to be read until the intent propagates from Topic A. 3. Convert those intents into commits. 4. Deal with all the failure cases in a distributed system, e.g. cleaning up abandoned intents, etc.

People do do this. I have done this. I wish I had been more principled with the error paths. It got there _eventually_.

It was a lot of code and complexity to ship a feature which in retrospect could have been nearly trivial with a transactional database. I'd say months rather than days. I won't get those years of my life back.

The products were built on top of Kafka, Cassandra, and Elasticsearch, where, over time, there was a desire to maintain some amount of referential integrity. The only reason we bought into this architecture at the time was horizontal scalability (not even multi-region). Kafka, sagas, and 2PC at the "application layer" can work, but you're going to spend a heck of a lot on engineering.

It was this experience that drove me to Cockroach and I've been spreading the good word ever since.

> If you use an OLTP database, and generate events into Kafka via CDC, you get the best of both worlds.

This is the next chapter in the gospel of the distributed transaction.


>> If you use an OLTP database, and generate events into Kafka via CDC, you get the best of both worlds.

> This is the next chapter in the gospel of the distributed transaction.

Actually, it's the opposite. CDC helps you avoid distributed transactions: apps write to a single database only, and other resources (Kafka, other databases, etc.) are updated asynchronously based on that, in an eventually consistent fashion.


I mean, I hear you that people do that, and it does allow you to avoid needing distributed transactions. When you're stuck with an underlying NoSQL database without transactions, this is the best thing you can do. This is the whole "invert the database" idea, which ends up looking like [1].

I'm arguing that that usage pattern was painful, and that if you can have horizontally scalable transaction processing with a downstream stream of events from committed transactions, you can use it to power correct, easy-to-reason-about materializations as well as asynchronous task processing.

Transact to change state. React to state changes. Prosper.

[1] https://youtu.be/5ZjhNTM8XU8?t=2075


> If you use an OLTP database, and generate events into Kafka via CDC, you get the best of both worlds.

100% agree this is the way to go: instead of rolling your own transaction support, you get ACID for free from the DB and use Kafka to archive changes and subscribe to them.


It seems like Kafka could be used to record requests and responses (upstream of the database) and approved changes (downstream of the database), but isn't good for the approval process itself?


Hey Arjun, thanks for taking the time to reply! I didn't mean to suggest that this was Frank's opinion; I merely explained to the other user, who seemed to have a more negative attitude towards Materialize, why I am extremely excited about it, namely that Frank is the CTO and has an extremely good track record in terms of research and code.

I think the general gist of "use an OLTP database as your write model if you don't absolutely know what you're doing" is completely sane advice; however, I think there are far more nuanced (and in a sense also more honest) arguments that can be made.

I think the architecture you've sketched is over-engineered for the task. So here's what I'd build for your inventory example, IFF the inventory was managed for a company with a multi-tera-item inventory that absolutely NEEDS horizontal scaling:

One event topic in Kafka, partitioned by the ID of the different stores whose inventory is to be managed. This makes the (arguably) strong assumption that inventory can never move between stores, but for, e.g., a grocery chain that's extremely realistic.

We have one writer per store-ID partition, which generates the events and enforces _serialisability_, with a hot writer failover that keeps up to date and a STONITH mechanism connecting the two. All writing REST calls / GraphQL mutations for its store-ID range go directly to that node.

The node serves all requests from memory, out of an immutable data structure, e.g. Immutable.js, Clojure maps, DataScript, or an embedded DB that supports transactions and rollbacks, like SQLite.

Whenever a write occurs, the writer generates the appropriate events, applies them to its internal state, validates that all invariants are met, and then emits the events to Kafka. Kafka acknowledging the write is potentially much quicker than acknowledging an OLTP transaction, because Kafka only needs to get the events into the memory of 3+ machines, instead of writing them to disk on 1+ machines (I'm ignoring OLTP validation overhead here because our writer already did that). Also, your default failure resistance is much higher than what most OLTP systems provide in their default configuration (e.g. equivalent to Postgres synchronous replication).

Note that the critical section doesn't actually have to be the whole "generate event -> apply -> validate -> commit to Kafka" code. You can optimistically generate events, apply them, and then invalidate and retry all other attempts once one of them commits to Kafka. However, that also introduces coordination overhead, so you might be better served mindlessly working off requests one by one.

Once the write has been acknowledged by Kafka, you swap the variable/global/atom with the new immutable state or commit the transaction, and continue with the next incoming request.

All the other (reading) requests are handled by various views on the Kafka topic (the one causing the inconsistencies in the article). They might be lagging behind a bit, but that's totally fine, as all writing operations have to go through the invariant-enforcing write model anyway. So they're allowed to be slow-ish, or have varying QoS in terms of freshness.

The advantage of this architecture is that you have few moving parts, but those are nicely decomplected, as Rich Hickey would say: you use the minimal state for writing, which fits into memory and caches; you get zero lock contention on writes (no locks); you make the boundaries for transactions explicit and gain absolutely free rein on constraints within them; and you get serialisability for your events, which is super easy to reason about mentally. Plus you don't get performance penalties for recomputing table views for view models. (If you don't use change capture and Materialize already, which one should of course ;] )

The Two Generals problem dictates that you can't have more than a single writer in a distributed system for a single "transactional domain" anyway. All our consensus protocols are fundamentally leader election. The same is true for OLTP databases internally (threads, table/row locks, etc.), so if you can't handle the load on a single zero-overhead writer that just takes care of its small transactional domain, then your database will run into the exact same issues, probably earlier.

Another advantage of this that has so far gone unmentioned is that it allows you to provide global identifiers for the state in your partitions, which can be communicated in side-effectful interactions with the outside world. If your external service allows you to store a tiny bit of metadata with each effectful API call, then you can include the offset of the current event, and thus the state. That way you can subsume the external state and transactional domain into the transactional domain of your partition.

Now, I think that's a much more reasonable architecture, one that at least doesn't have any of the consistency issues. So let's take it apart and show why the general populace is much better served with an OLTP database:

- Kafka is an ops nightmare. Setting up a cluster requires A LOT of configuration. Also ZooKeeper, urgh; they're luckily trying to get rid of it, but I think they only dropped it this year, and I'm not sure how mature that is.

- You absolutely 100% need immutable data structures, or something else that manages transactions for you inside the writer. You DO NOT want to manually roll back changes in your write model. Defensive copying is a crutch: slow and error-prone (cue JS, urgh...: {...state}).

- Your write model NEEDS to fit into memory. That thing is the needle's eye that all your data has to go through. If you run the single-event-loop variant, latency during event application WILL break you. If you do the optimistic-concurrency variant, performing validation checks might be as expensive as, or more expensive than, recomputing the events from scratch.

- Be VERY wary of communications with external services that happen in your write model. They introduce latency, and they break the transactional boundary that you set up before. To be fair, OLTP databases also suffer from this, because it's distributed consistency with more than one writer and arbitrary invariants, which this universe simply doesn't allow for.

- As mentioned before, it's possible to optimistically generate and apply events thanks to the persistent data structures, but that is also logic you have to maintain, and it is essentially a very naive and simple embedded OLTP. Also be wary of what you think improves performance vs. what actually improves it. It might be better to have one cool core than 16 really hot ones doing wasted work.

- If you don't choose your transactional domains well, or the requirements change, you're potentially in deep trouble. You can't transact across domains/partitions; if you do, they're really the same domain, and you potentially overload a single writer.

- Transactional domains are actually not as simple as they're often portrayed. They can nest and intersect. You'll always need a single writer, but that writer can delegate responsibility, which might be a much cheaper operation than the work itself. Take bank accounts as an example. You still need a single writer/leader to decide which account IDs are currently in a transaction with each other, but if two accounts are currently free, that single writer can tie them into a single transactional domain and delegate it to a different node, which will perform the transaction, write, and return control to the "transaction manager". A different name for such a transaction manager is an OLTP database (with row-level locking).

- You won't find as many tutorials. If you're not comfortable reading scientific papers, or at least academic-grade books like Martin Kleppmann's "Designing Data-Intensive Applications", don't go there.

- You probably won't scale beyond what a single OLTP DB can provide anyway. Choose tech that is easy to use and gives you as many guarantees as possible, if you can. With change capture you can also do retroactive event analytics and views, but you don't have to code up a write model (and associated framework, because let's be honest, this stuff is still cutting edge and really shines in bespoke custom solutions for bespoke custom problems).

Having such an architecture under your belt is a super useful tool that can be applied to a lot of really interesting and hard problems, but that doesn't mean that it should be used indiscriminately.

Afterthought: I've seen so many people use an OLTP database but then perform multiple transactions inside a single request handler, just because that's what their ORM was set to. So I'm just happy about any thinking that people spend on transactions and consistency in their systems, in whatever shape or form, and I think making the concepts explicit instead of hiding them in a complex (and for many unapproachable) OLTP/RDBMS monster helps with that (whether Kafka is less of a monster is another story).
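To illustrate that ORM pitfall: a request handler that should be one atomic unit often silently runs each statement in its own transaction. In plain SQL the fix is an explicit transaction block (table names invented for illustration):

    -- One request, one transaction: both statements commit together or not at all.
    BEGIN;
    UPDATE inventory SET quantity = quantity - 1 WHERE item_id = 42;
    INSERT INTO reservations (item_id, customer_id) VALUES (42, 7);
    COMMIT;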

I think it's also important not to underestimate the programmer convenience of working with language-native (persistent) data structures. The writer itself, in its naive implementation, is something that one can understand in full, and not relying on opaque and transaction-breaking ORMs is a huge win.

PS: Plz start something collaborative with ObservableHQ; having reactive notebook-based dashboards over reactive Postgres queries would be so, so, so, so awesome!


Materialize | Engineers | Marketing | NYC HQ + North America Remote + EU Remote in early 2021 | http://materialize.io/careers

Materialize is a streaming database for real-time applications. Materialize lets you ask questions about your data, and then get low-latency, correct answers, which are kept incrementally updated as the underlying data changes.

Materialize is built on Timely Dataflow, a low-latency cyclic dataflow computational model, first introduced in the paper "Naiad: a timely dataflow system". Materialize is co-founded by Frank McSherry, the primary author of Timely Dataflow (http://timelydataflow.com) and Differential Dataflow (http://differentialdataflow.com), the two open source projects that power Materialize. Materialize itself is source-available and entirely written in Rust: https://github.com/MaterializeInc/materialize

Materialize is a team of over twenty, primarily based in New York City but also open to remote positions. We are hiring in all engineering positions (eng. manager, engineers from new grad to principal) as well as several non-engineering positions. For the full list, see http://materialize.io/careers

We are a team with significant experience in databases and distributed systems, and we are looking to add more folks with that interest and/or experience to our team.


Frank is on my top 5 list of individuals I’d love to work with. Applying.


The `Open Positions` section on the careers page returns an `Error fetching jobs` message. Is there any other place where I can take a look at the positions and apply?


Are any of the 'Software Engineer' positions available as a remote option? I don't see them listed as such, but thought I'd ask anyhow.


I live in Europe. Does that mean I should wait until 2021 to apply?


I got some questions. What would be the best way to get in touch?


If you're interested in a job, apply and talk to us; that's by far the best way! If you just want to chat, my profile has a variety of ways to contact me as well.


No love for SA remote? :(


The reasoning is that as you gain information, you also have a duty to the people in the control group to use the best available information to take care of their health. Once you gain "enough" information ("enough" being statistically defined) that the drug helps, each additional person you let languish in the control group (who is denied access to the drug) is a cost that must be weighed against the benefit of getting additional information. When the data is clear enough, the cost can exceed the benefit, and you stop early.

Typically you'd register a "stopping rule" before you start your trial: a good drug often will trigger the stopping rule, as it helped so much that we learned about its efficacy on a smaller N than originally planned.

There are many different stopping criteria, depending on the trial (in safety trials you'd typically stop because you've found evidence that the drug is unsafe and continuing would be unfair to the folks in the treatment group, whereas in efficacy trials after safety has been established, you'd stop because you've found evidence that the drug is effective and continuing would be unfair to the folks in the control group).


The reasoning is money, the unreasoning is ethical sophistry.


Materialize | Engineers | Marketing | NYC and North America Remote | http://materialize.io/careers

Materialize is a streaming database for real-time applications. Materialize lets you ask questions about your data, and then get low-latency, correct answers, which are kept incrementally updated as the underlying data changes.

Materialize is built on Timely Dataflow, a low-latency cyclic dataflow computational model, first introduced in the paper "Naiad: a timely dataflow system". Materialize is co-founded by Frank McSherry, the primary author of Timely Dataflow (http://timelydataflow.com) and Differential Dataflow (http://differentialdataflow.com), the two open source projects that power Materialize. Materialize itself is source-available and entirely written in Rust: https://github.com/MaterializeInc/materialize

Materialize is a team of over twenty, primarily based in New York City but also open to remote positions. We are hiring in all engineering positions (eng. manager, engineers from new grad to principal) as well as several non-engineering positions. For the full list, see http://materialize.io/careers

We are a team with significant experience in databases and distributed systems, and we are looking to add more folks with that interest and/or experience to our team.


This posting says North America remote, but the site says US Remote. Can you clarify which it is?


I had some genuinely great chats with Arjun last summer, though life got busy and I didn't follow up at the time. I have no doubt that it's a fantastic group of colleagues to work with, though.


How strict are the requirements for the marketing positions? Do you have something suitable for entry-level / recent graduate?


Should engineers from new grad to senior levels apply under the Principal Engineer listing?


I fixed the listing, thanks for pointing that out!


Please do!


Nice! They're using Rust.


Yes, we do! It's a little non-standard SQL syntax we added called TAIL. You write TAIL <viewname>, and you get changes pushed to you. You can see a video of me badly explaining it in [2]. You can also do the thing where you create a Kafka SINK of a view and have the system push this to Kafka rather than have a long-running open SQL connection. There's some fun stuff with TAIL AS OF (start the changes at a specific point in time) and TAIL WITH SNAPSHOT (run a SELECT query, and then begin the change stream transactionally with the time of that SELECT query) that you can read about in the docs[1].

[1] https://materialize.io/docs/sql/tail/

[2] https://youtu.be/9XTg09W5USM?t=2650
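A minimal sketch of what that looks like in a SQL session, going by the docs linked above (the view and source names are made up; exact options and output format may differ):

    -- Assumes a source named cart_events has already been created (e.g. from Kafka).
    CREATE MATERIALIZED VIEW cart_sizes AS
    SELECT customer_id, count(*) AS items
    FROM cart_events
    GROUP BY customer_id;

    -- Stream every subsequent change to the view over the open SQL connection.
    TAIL cart_sizes;

    -- Or take a snapshot first, then stream changes from that same timestamp.
    TAIL cart_sizes WITH SNAPSHOT;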


I don't believe Metabase supports the Calcite SQL dialect (https://github.com/metabase/metabase/issues/6230), which is what Crux is using for the SQL layer. So I believe the answer is no - but I'm not an expert here, so don't take this answer as definitive.


As it happens there was a comment on a related issue about Dremio support [0] (which also uses Calcite) where someone shared that they got a Dremio driver working. I was able to fork their driver and get Metabase working against Crux in time for a live demo of the crux-sql module back in May [1].

I just switched the forked driver repo to public if anyone wants to test it out [2]. The Metabase driver docs are pretty straightforward to get things running. There's definitely work to be done though and I didn't get very deep into it really - but I hope to pick it up again soon!

[0] https://github.com/metabase/metabase/issues/5562

[1] https://youtu.be/StXLmWvb5Xs?t=996

[2] https://github.com/crux-labs/crux-metabase-driver

(I work on Crux :)

EDIT: it's also worth a mention that the linked issue #6230 discusses a lot of problems deriving from Druid's lack of support for prepared statements, but Crux and Dremio don't have that limitation. Although reading the most recent comment it looks like Druid may have overcome that hurdle now, so with a bit of luck there might now be more traction to get mainline Metabase support for a generic Calcite driver!


You can write your own driver for Metabase without _too_ much hassle: https://github.com/metabase/metabase/wiki/Writing-A-Driver


The argument is that without stronger consistency guarantees you can't do joins between two streams (or even something like argmax over a single stream, since it splits the stream into two subcomputations, which then have to be joined back together).

I think when folks say that eventual consistency is okay, they're thinking about simple aggregates - where transient incorrectness in the result is indistinguishable from noise.

But if you want to do joins, you really want to be able to reason about your unbounded streams causally - Flink, Beam (and, as another commenter points out, Firebase as well) provide stronger consistency guarantees on computations over unbounded streams.
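To make the argmax point concrete, here is the kind of SQL it turns into: an aggregate over the stream joined back against the stream itself (names invented for illustration). If the two sides of the join can be read at inconsistent points in time, the result can transiently drop rows or return the wrong ones:

    -- "Which event achieved the maximum value per key?" becomes a self-join:
    -- one subcomputation aggregates the stream, the other joins that aggregate
    -- back against the stream to recover the winning event.
    SELECT e.key, e.event_id, e.value
    FROM events e
    JOIN (
        SELECT key, max(value) AS max_value
        FROM events
        GROUP BY key
    ) m ON e.key = m.key AND e.value = m.max_value;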


This, of course, presumes that people want to do JOINs. Wouldn't eventual consistency mostly be a NoSQL thing, where joins are not an issue?


Jay Kreps is an extremely well read man; the reference is surely ironic. Enterprise middleware, after all, is and always has been a bureaucratic rigmarole.

The nightmare has existed since the first ETL job was written, but by democratizing pub/sub access throughout companies, Kafka has made it impossible not to be horrified by what we see going on in the data plane. Apache Kafka, rather than creating this nightmare, brings light to it, just as Franz Kafka's literature brought light to the absurdity of life under evil bureaucracies.

Now he can't quite say this outright lest he offend some set of his customers who have to live in this reality, so we have that vague quote about "he's one of my favorite writers", but surely we can take a direct literary reference as a wink and a nod and read between the lines?


> evil bureaucracies

I think Kafka's point is that those bureaucracies are not evil, just uncaring. Which makes the end results even more sinister, as you get crushed through no clear malicious act.


This is a great description. It sums up well my thoughts about Kafka that I hadn't verbalised.


I think a 5-second summary of the difference is that Materialize is Oracle FAST REFRESH for almost* all of SQL, as opposed to just those very limited cases where Oracle's FAST REFRESH works.

Oracle's docs have entire features to help you diagnose whether your views are eligible for fast or synchronous refresh [1]. In Materialize, everything you can express in SQL is fast refreshed. No caveats.

[1] https://docs.oracle.com/database/121/DWHSG/sync.htm#DWHSG914...

*Almost, because the surface area is truly astonishing.
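As a rough illustration of the gap: a view like the one below (schema invented for illustration) is exactly the kind of multi-way join plus aggregation that tends to fall outside Oracle's fast-refresh eligibility rules, but in Materialize it's just another incrementally maintained view:

    -- A multi-way join with an aggregate, kept incrementally up to date.
    CREATE MATERIALIZED VIEW revenue_by_region AS
    SELECT r.name AS region, sum(oi.quantity * p.price) AS revenue
    FROM order_items oi
    JOIN orders o ON oi.order_id = o.id
    JOIN products p ON oi.product_id = p.id
    JOIN customers c ON o.customer_id = c.id
    JOIN regions r ON c.region_id = r.id
    GROUP BY r.name;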


Thanks for the vote of confidence! One thing: We're not marketing it in the OLAP space. Our existing users very much are building new applications.

Initially we went for the metaphor of "what if you could keep complex SQL queries (e.g. 6-way joins and complex aggregations, the kinds of queries that today are essentially impossible outside a data warehouse) incrementally updated in your application within milliseconds? What would you build?"

We're moving away from that metaphor because it seems it's more confusing than helpful. Tips always appreciated!


Ah, thanks for the correction. In any case I'm looking forward to trying it out eventually--got a number of other things ahead in the queue though.

My suggestion would be to consider comparing it to Firebase queries. Firebase devs are already familiar with how incrementally updated queries can simplify application development a lot. But, despite Firebase's best marketing attempts, the queries are very restrictive compared to SQL or Datalog.

