hackerzr's comments

Thoughts on this?

Pinecone just published a technical deep-dive into how they're redesigning their vector database architecture to handle three increasingly common workloads:

- Recommender systems requiring 1000s of QPS
- Semantic search across billions of documents
- Agentic systems with millions of independent agents operating simultaneously

Among other things, a "log structured indexing" approach uses immutable "slabs" to balance freshness and performance. Writes go to in-memory memtables that flush to blob storage as L0 slabs using fast indexing (scalar quantization/random projections), while background compaction creates larger slabs with more intensive partition/graph-based indexes.
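
To make the slab idea concrete, here's a rough toy sketch of the write path described above. It is not Pinecone's code: the class and parameter names (`ToyLogStructuredIndex`, `memtable_limit`, `compaction_fanout`) are made up, everything lives in memory instead of blob storage, and a brute-force scan stands in for the quantized and graph-based indexes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slab:
    """An immutable batch of (id, vector) pairs; level 0 = freshly flushed."""
    level: int
    vectors: tuple


class ToyLogStructuredIndex:
    # Hypothetical names and thresholds, chosen only for illustration.
    def __init__(self, memtable_limit=2, compaction_fanout=2):
        self.memtable = {}   # mutable in-memory buffer for recent writes
        self.slabs = []      # immutable slabs, ordered oldest to newest
        self.memtable_limit = memtable_limit
        self.compaction_fanout = compaction_fanout

    def upsert(self, vec_id, vector):
        self.memtable[vec_id] = tuple(vector)
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        """Freeze the memtable into a new L0 slab (stand-in for a blob write)."""
        if self.memtable:
            self.slabs.append(Slab(0, tuple(self.memtable.items())))
            self.memtable = {}

    def compact(self):
        """Merge all L0 slabs into one larger L1 slab; newer writes win."""
        l0 = [s for s in self.slabs if s.level == 0]
        if len(l0) < self.compaction_fanout:
            return
        merged = {}
        for slab in l0:  # oldest first, so later slabs overwrite earlier ones
            merged.update(slab.vectors)
        self.slabs = [s for s in self.slabs if s.level != 0]
        self.slabs.append(Slab(1, tuple(merged.items())))

    def query(self, q, k=1):
        """Brute-force nearest neighbours over slabs plus the memtable."""
        live = {}
        for slab in self.slabs:
            live.update(slab.vectors)
        live.update(self.memtable)  # unflushed writes are visible immediately
        def dist(i):
            return sum((x - y) ** 2 for x, y in zip(live[i], q))
        return sorted(live, key=dist)[:k]
```

The point of the sketch is the freshness property: a write is queryable as soon as it lands in the memtable, and compaction only changes how the data is laid out, not what a query sees.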

This design solves a few issues:

- It enables high freshness for all workloads (including recommenders)
- It supports both graph-based and other indexing approaches in the same system
- It eliminates the traditional build/serve split for recommender workloads
- It provides predictable caching between local SSD and memory

They're also introducing disk-based metadata filtering using bitmap indices adapted from data warehouses, which helps with high-cardinality filtering use cases like access control lists.
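
For anyone who hasn't run into bitmap indices before, here's a minimal sketch of the general technique applied to an ACL-style filter. Assumptions on my part: it's entirely in memory (the post describes a disk-based design), it uses plain Python integers as uncompressed bitmaps, and `BitmapIndex`, `add`, `match` and `rows` are hypothetical names, not anything from Pinecone's API.

```python
from collections import defaultdict


class BitmapIndex:
    """One bitmap per (field, value) pair; bit i is set if row i has that value."""

    def __init__(self):
        self.bitmaps = defaultdict(int)

    def add(self, row_id, metadata):
        # metadata maps a field name to a list of values, e.g. an ACL list.
        for field, values in metadata.items():
            for value in values:
                self.bitmaps[(field, value)] |= 1 << row_id

    def match(self, field, value):
        # .get avoids creating empty entries for values never indexed.
        return self.bitmaps.get((field, value), 0)


def rows(bitmap):
    """Expand a bitmap into the sorted list of row ids whose bit is set."""
    out, i = [], 0
    while bitmap:
        if bitmap & 1:
            out.append(i)
        bitmap >>= 1
        i += 1
    return out


index = BitmapIndex()
index.add(0, {"acl": ["alice", "bob"], "lang": ["en"]})
index.add(1, {"acl": ["bob"], "lang": ["en"]})
index.add(2, {"acl": ["alice"], "lang": ["de"]})

# Filters compose with bitwise operators before any vector is touched.
visible_to_alice_in_english = index.match("acl", "alice") & index.match("lang", "en")
visible_to_alice_or_bob = index.match("acl", "alice") | index.match("acl", "bob")
```

With high-cardinality fields like ACLs there are many bitmaps and most are sparse, which is why real systems compress them rather than storing raw bit strings like this.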

What do you think?


An engineer at Coralogix, a full-stack observability platform, recently shared an intriguing solution to translating SQL expressions with null semantics into OpenSearch DSL. This challenge arose while building the DataPrime query language and engine, which needed to maintain backwards compatibility with OpenSearch.

Key points:

The engineer confronted the disparity between SQL's three-valued logic (TRUE, FALSE, NULL) and OpenSearch DSL's binary filter system.

They devised a method to reduce three-valued logic to two-valued logic by considering the context of expressions, such as WHERE clauses.

The solution introduces is_false_or_null() and is_true_or_null() functions to bridge SQL and OpenSearch DSL.

Boolean operators are handled by analyzing truth tables and deriving corresponding OpenSearch DSL translations.

This approach enables the translation of complex SQL expressions to OpenSearch DSL while maintaining correct null semantics.

The team implemented additional optimizations on the intermediate representation to enhance query efficiency.
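
Here's my own rough sketch of how a reduction like this can work, not the article's implementation. Assumptions: expressions are small tuples, fields are single-valued, a missing field means NULL, and I'm guessing that is_true_or_null() and is_false_or_null() are the complements of the "definitely false" and "definitely true" cases. The helpers `is_true`, `is_false`, `sql_eval` and `matches` are hypothetical; only the clause shapes (`bool`, `must`, `should`, `must_not`, `term`, `exists`) are real OpenSearch DSL.

```python
# Expressions: ("eq", field, value), ("not", e), ("and", a, b), ("or", a, b)

def is_true(e):
    """DSL matching documents where e is TRUE (what a WHERE clause keeps)."""
    op = e[0]
    if op == "eq":
        return {"term": {e[1]: e[2]}}
    if op == "not":
        return is_false(e[1])
    if op == "and":  # TRUE only when both sides are TRUE
        return {"bool": {"must": [is_true(e[1]), is_true(e[2])]}}
    if op == "or":   # TRUE when either side is TRUE
        return {"bool": {"should": [is_true(e[1]), is_true(e[2])],
                         "minimum_should_match": 1}}
    raise ValueError(op)

def is_false(e):
    """DSL matching documents where e is FALSE (not NULL)."""
    op = e[0]
    if op == "eq":   # field must exist, otherwise the comparison is NULL
        return {"bool": {"must": [{"exists": {"field": e[1]}}],
                         "must_not": [{"term": {e[1]: e[2]}}]}}
    if op == "not":
        return is_true(e[1])
    if op == "and":  # FALSE when either side is FALSE, even if the other is NULL
        return {"bool": {"should": [is_false(e[1]), is_false(e[2])],
                         "minimum_should_match": 1}}
    if op == "or":   # FALSE only when both sides are FALSE
        return {"bool": {"must": [is_false(e[1]), is_false(e[2])]}}
    raise ValueError(op)

def is_true_or_null(e):
    return {"bool": {"must_not": [is_false(e)]}}

def is_false_or_null(e):
    return {"bool": {"must_not": [is_true(e)]}}

def sql_eval(e, doc):
    """Reference three-valued evaluation; None stands for SQL NULL."""
    op = e[0]
    if op == "eq":
        return None if e[1] not in doc else doc[e[1]] == e[2]
    if op == "not":
        v = sql_eval(e[1], doc)
        return None if v is None else not v
    a, b = sql_eval(e[1], doc), sql_eval(e[2], doc)
    if op == "and":
        if a is False or b is False:
            return False
        return None if a is None or b is None else True
    if a is True or b is True:  # op == "or"
        return True
    return None if a is None or b is None else False

def matches(q, doc):
    """Tiny evaluator for the DSL subset emitted above, for checking only."""
    if "term" in q:
        (field, value), = q["term"].items()
        return field in doc and doc[field] == value
    if "exists" in q:
        return q["exists"]["field"] in doc
    b = q["bool"]
    should = b.get("should")
    return (all(matches(c, doc) for c in b.get("must", []))
            and not any(matches(c, doc) for c in b.get("must_not", []))
            and (not should or any(matches(c, doc) for c in should)))

# WHERE NOT (status = 'error'): documents without a status field must be excluded.
query = is_true(("not", ("eq", "status", "error")))
```

The interesting case is NOT: a naive `must_not` around the `term` query would also return documents where `status` is missing, whereas SQL drops them because the predicate is NULL there. Tracking "true" and "false" separately is what keeps those two cases apart.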

This solution allows Coralogix users to seamlessly query both Parquet files and OpenSearch using a unified query language. The article offers valuable insights for developers working on query engines or database compatibility layers, demonstrating a creative approach to a common challenge in data querying and observability platforms.

