Excel has quietly implemented one of the most elegant incremental build systems in software, a reactive DAG engine that updates just what it needs to. We’re building Preswald, a reactive Python framework for data dashboards, and realized Excel solves many of the same problems. This post breaks down what spreadsheets can teach us about DAG runtimes, reactivity, and incremental computation.
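The core idea can be sketched in a few lines. This is a toy illustration of spreadsheet-style reactivity, not Excel's actual engine: each cell declares its inputs, and changing a value recomputes only the cells downstream of the change.

```python
# Toy sketch of a spreadsheet-style reactive DAG (illustrative only):
# each cell knows its inputs, and setting a value recomputes only the
# cells downstream of the change.

class Cell:
    def __init__(self, graph, name, inputs=(), formula=None, value=None):
        self.graph, self.name = graph, name
        self.inputs, self.formula, self.value = list(inputs), formula, value
        graph.cells[name] = self

class Graph:
    def __init__(self):
        self.cells = {}

    def set(self, name, value):
        self.cells[name].value = value
        self._recompute(name)

    def _dependents(self, name):
        return [c for c in self.cells.values() if name in c.inputs]

    def _recompute(self, name):
        # Walk downstream cells; a real engine would topologically sort
        # and deduplicate, but this shows the incremental idea.
        for cell in self._dependents(name):
            args = [self.cells[i].value for i in cell.inputs]
            cell.value = cell.formula(*args)
            self._recompute(cell.name)

g = Graph()
Cell(g, "A1", value=2)
Cell(g, "A2", value=3)
Cell(g, "B1", inputs=["A1", "A2"], formula=lambda a, b: a + b)
g.set("A1", 10)             # only B1 recomputes
print(g.cells["B1"].value)  # 13
```

The point is the shape of the computation: edits touch one node, and only its transitive dependents are re-evaluated.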
CSVs seem like a great idea until, suddenly, they aren't. They're simple, portable, and easy to open. No setup, no database, no friction. Just raw data, right there. That's why people love them. But the moment they get big, really big, everything breaks. Excel crashes. Pandas eats all your RAM. Even VS Code freezes up. Suddenly, what was supposed to be the easiest format becomes the hardest to work with.
The problem is, CSVs don’t scale. No indexing means every search is a full scan. No structure means every query is brute force. A 5GB CSV isn’t just 5GB—it’s 15GB in RAM once it’s loaded, maybe more. If you don’t have the memory, your system starts swapping, and everything slows to a crawl. Sorting? Painful. Joins? Basically impossible. The tools we use weren’t built for this, but we keep using them anyway because, well, what else is there?
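One escape hatch is to stop loading the whole file at all. A minimal sketch of streaming aggregation with the standard library (the file contents and column names here are made up for illustration; in practice you would `open()` a path on disk):

```python
# Sketch: stream over a CSV one row at a time instead of loading it
# all into memory. Column names are hypothetical.
import csv
import io

# Stand-in for a large file on disk.
data = io.StringIO("dish,waste_kg\npasta,1.5\npizza,0.5\npasta,2.0\n")

totals = {}
for row in csv.DictReader(data):  # reads one row at a time
    totals[row["dish"]] = totals.get(row["dish"], 0.0) + float(row["waste_kg"])

print(totals)  # {'pasta': 3.5, 'pizza': 0.5}
```

Memory stays bounded by one row plus the aggregate, not the file size. You give up random access and joins, which is exactly why columnar engines like DuckDB exist.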
We’re used to thinking of software as something permanent: bought, maintained, and scaled. But what if it didn’t have to be? The cost of writing code has plummeted, making it easier than ever to build ephemeral software (tools created for a specific need, used for a short time, then discarded). Instead of forcing rigid, bloated SaaS into every workflow, individuals can now spin up lightweight, hyper-personalized apps that do exactly what’s needed—nothing more, nothing less.
Debugging AI voice agents is way easier when you have the right dashboards. With DuckDB & Structured, you can pull logs locally, run instant SQL queries, and turn messy data into clear, interactive dashboards. This post walks through setting up a real-time AI debugging dashboard to spot ASR errors, intent misclassifications, and slow responses, so you can fix issues fast and improve performance.
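The workflow looks roughly like this. For a self-contained sketch we use the standard library's sqlite3 here; DuckDB offers the same SQL-over-local-data pattern with a faster analytical engine. The table and column names are made up for illustration:

```python
# Sketch of "instant SQL over local logs" using stdlib sqlite3.
# (DuckDB follows the same pattern.) Schema is hypothetical.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE turns (call_id TEXT, latency_ms INT, intent TEXT)")
con.executemany("INSERT INTO turns VALUES (?, ?, ?)", [
    ("c1", 120, "book_table"),
    ("c1", 950, "unknown"),
    ("c2", 300, "book_table"),
])

# Surface slow responses worth debugging.
slow = con.execute(
    "SELECT call_id, latency_ms FROM turns WHERE latency_ms > 500"
).fetchall()
print(slow)  # [('c1', 950)]
```

Once logs are queryable like this, each debugging question (slow turns, misclassified intents, ASR error rates) is one SQL statement feeding one dashboard panel.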
We recently explored Ibis, a Python library designed to simplify working with data across multiple storage systems and processing engines. It provides a DataFrame-like API, similar to Pandas, but translates Python operations into backend-specific queries. This allows it to work with SQL databases, analytical engines like BigQuery and DuckDB, and even in-memory tools like Pandas. By acting as a middle layer, Ibis addresses challenges like fragmented storage, scalability, and redundant logic, enabling a more consistent and efficient approach to multi-backend data workflows. We wrote up some of our learnings.
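The pattern Ibis implements can be sketched in miniature: an expression object that records operations lazily and compiles them to backend-specific SQL, rather than executing eagerly. This toy is not Ibis's real API (which is far richer); it just shows the middle-layer idea:

```python
# Toy sketch of the lazy-expression pattern behind Ibis: build up a
# query as a Python object, then compile it to SQL for a backend.
# (Not Ibis's actual API; names here are illustrative.)

class Table:
    def __init__(self, name):
        self.name, self.conditions = name, []

    def filter(self, condition):
        t = Table(self.name)
        t.conditions = self.conditions + [condition]
        return t

    def to_sql(self):
        where = " AND ".join(self.conditions)
        return f"SELECT * FROM {self.name}" + (f" WHERE {where}" if where else "")

expr = Table("orders").filter("amount > 100").filter("region = 'EU'")
print(expr.to_sql())
# SELECT * FROM orders WHERE amount > 100 AND region = 'EU'
```

Because the expression is data rather than executed code, the same logic can be compiled for DuckDB, BigQuery, or executed against in-memory Pandas, which is what makes the multi-backend story work.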
Analytics success depends on asking the right questions upfront. Many projects fail because they start with vague goals like “We need a dashboard” without understanding the problem they’re solving. Analytics spans a spectrum from descriptive (summarizing past events) to prescriptive (recommending actions), and its design should match the decisions it enables, the users it serves, and the constraints it operates under. Misalignment—choosing the wrong type of analytics or tools—leads to wasted efforts. By focusing on purpose, audience, and context, organizations can craft analytics systems that guide meaningful decisions instead of merely generating data.
Analytics evolves across four levels: descriptive (what happened), diagnostic (why it happened), predictive (what will happen), and prescriptive (what to do about it). Each level requires different tools and capabilities, from data cleaning and interactivity to model explainability and operational integration. Delivery methods—real-time, batch, embedded, or ad-hoc—must align with the specific job, whether monitoring systems, diagnosing issues, or predicting trends. Success lies in fit: understanding users, the job at hand, and constraints like scale or compliance. The best analytics systems solve targeted problems exceptionally well, while the worst attempt to be one-size-fits-all.
TensorFlow organizes computations as a DAG: nodes represent operations, and edges define their dependencies. This structure enforces order, ensures reproducibility, and provides transparency. Operations only run when their dependencies are resolved, and the graph guarantees consistent outputs from the same inputs.
Now think about notebooks. Cells are like operations in a graph, but without enforced dependencies or execution order. Hidden dependencies and unpredictable execution often lead to messy, unreliable workflows.
What if we applied DAG principles to notebooks? Cells could become nodes in a graph, with explicit dependencies and predictable execution. A preprocessing cell, for instance, would depend on the data-loading cell. Change one, and downstream cells automatically update.
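A minimal sketch of that idea, with made-up cell names: cells declare their dependencies, and editing one cell reruns only the cells downstream of it, in dependency order.

```python
# Sketch: notebook cells as DAG nodes with explicit dependencies.
# Changing a cell reruns only its downstream cells. Cell names and
# bodies are hypothetical.
cells = {
    "load":   {"deps": [],       "run": lambda env: {"data": [1, 2, 3]}},
    "prep":   {"deps": ["load"], "run": lambda env: {"clean": [x * 2 for x in env["data"]]}},
    "report": {"deps": ["prep"], "run": lambda env: {"total": sum(env["clean"])}},
}

def downstream(name):
    out = set()
    for cid, cell in cells.items():
        if name in cell["deps"]:
            out |= {cid} | downstream(cid)
    return out

def run(changed, env):
    # Rerun the changed cell plus everything downstream, in order.
    dirty = {changed} | downstream(changed)
    order = [c for c in cells if c in dirty]  # dict order is topo order here
    for cid in order:
        env.update(cells[cid]["run"](env))
    return env

env = run("load", {})
print(env["total"])  # 12
```

Rerunning `run("prep", env)` would skip `load` entirely: the same incremental property spreadsheets have had all along.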
Imagine if building data apps were 100x easier—what would change? In the 1990s, photography was deliberate and scarce. Film was expensive. Every photo was a decision. Fast forward to today: with smartphones, photography is instant, cheap, and everywhere. We don’t just capture special occasions anymore—we capture everything.
This same transformation is possible for data apps. Right now, building even simple data apps is complex:
Setting up connectors to pull data
Writing SQL to clean and model it
Engineering a frontend and backend
Deploying and maintaining the stack
The result? Data apps today are reserved for the most critical use cases. No one spends $50K to build a temporary dashboard for a weekend festival or a hyper-local app for a single restaurant. The ROI isn’t there.
But what happens when building data apps costs 1/100th of what it does today?
A restaurant manager could quickly analyze food waste patterns, broken down by dish, time, and chef.
A festival organizer could track foot traffic, vendor sales, and bathroom wait times in real time.
An individual sales rep could spin up a live dashboard to track their own deal pipeline.
What new behaviors and ecosystems would emerge if building data apps were this fast and cheap?
What we learned from building a no-code data stack, why we changed course, and how AI copilots solve what no-code apps simply cannot.
We set out to solve a problem that seemed, at first glance, straightforward. Businesses have data locked in different systems: CRMs, product analytics platforms, billing tools, and more. To answer even basic business questions, they needed to extract and combine data from these silos.
Our approach was ambitious. We wanted to build a tool that handled every step of the data pipeline — data ingestion, modeling, schema mapping, metric definitions, and visualization — all through an intuitive and simple no-code SaaS interface. Our vision was to empower analysts, ops teams, and business users to do what once required a full data engineering team.
But as we built, iterated, and onboarded real use cases, we uncovered deep structural flaws in this approach. These learnings have been pivotal in shaping our current belief that low-code/no-code solutions are fundamentally misaligned with the challenges of real-world data complexity.
We learned that code-first tools, with the right scaffolding, allow you to embrace complexity without being overwhelmed by it. With AI copilots like Preswald, code-first architectures, and unified control planes, we can finally create data stacks that are both powerful and approachable.