
Hmm... the US has 330 million people. Japan has 120 million in an area the size of California. They have a long way to go.


One company looking to capitalize on this is Onehouse, a three-year-old Californian startup founded by Vinoth Chandar, who created the open source Apache Hudi project while serving as a data architect at Uber. Hudi brings the benefits of data warehouses to data lakes, creating what has become known as a “data lakehouse,” enabling support for actions like indexing and performing real-time queries on large datasets, be that structured, unstructured, or semi-structured data.
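
To make the "warehouse features on a data lake" idea concrete, here is a minimal PySpark sketch of writing a Hudi table. It assumes Spark with the Hudi bundle on the classpath; the table name, fields, and S3 path are hypothetical.

    from pyspark.sql import SparkSession

    # Minimal Hudi write from Spark; upserts, indexing, and incremental
    # queries on this path are what make the lake behave like a warehouse.
    spark = (
        SparkSession.builder
        .appName("hudi-sketch")
        .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .getOrCreate()
    )

    df = spark.createDataFrame(
        [("r1", "2024-01-01 00:00:00", 42)],
        ["record_id", "ts", "value"],
    )

    hudi_options = {
        "hoodie.table.name": "events",                           # hypothetical table
        "hoodie.datasource.write.recordkey.field": "record_id",  # key used for upserts
        "hoodie.datasource.write.precombine.field": "ts",        # latest record wins
        "hoodie.datasource.write.operation": "upsert",
    }

    df.write.format("hudi").options(**hudi_options).mode("append").save("s3a://my-bucket/lake/events")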


The idea of running workloads on the cheapest compute platform is interesting.


Location: Irvine, CA

Remote: Preferred, but Hybrid fine within 20 miles of Irvine, CA

Willing to Relocate: No

Technologies: Strongest in Java, Kubernetes, MongoDB, SQL, Kafka, Data Lakehouse (wrote Java apps as a developer lead at IBM, was on the Kubernetes 0.1 team at Red Hat, spent 3+ years at MongoDB leading the technical direction at their #1 ARR account worldwide, and was Head of Community and DevRel at a $65 million VC-funded startup building an open source version of Snowflake)

Resume: http://linkedin.com/in/atwong

Email: atwong@alumni.uci.edu

About me:

My game is changing the world at warp speed. I've honed my skills across the board: sales engineer, developer advocate, lead developer, even infrastructure and operations maestro. Think of me as a multi-class strategist with 10 patents under my belt and 20+ certifications in my arsenal: Java, .NET, you name it. I can build it, deploy it, talk the talk, and walk the walk, from bare metal to cloud-native code. So, what's the next challenge we'll tackle together?

I am looking for my next job in sales engineering or developer relations.


Data warehousing becomes ubiquitous and open source with Iceberg. Powerful open-source RTA (real-time analytics) engines like ClickHouse are designed for crunching numbers as fast as possible. How can you use Iceberg and ClickHouse together to burn rubber over glacially slow storage?
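
A rough sketch of what that can look like in practice, using the clickhouse-connect Python client and ClickHouse's iceberg() table function (available in recent releases); the bucket, credentials, and column names are made up.

    import clickhouse_connect

    # Point ClickHouse at an Iceberg table on S3 and let the engine do the aggregation.
    client = clickhouse_connect.get_client(host="localhost", username="default", password="")

    result = client.query(
        """
        SELECT toDate(event_time) AS day, count() AS events
        FROM iceberg('https://my-bucket.s3.amazonaws.com/warehouse/events/',
                     'AWS_KEY', 'AWS_SECRET')
        GROUP BY day
        ORDER BY day DESC
        LIMIT 10
        """
    )
    for row in result.result_rows:
        print(row)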


Hmm... I'm not sure I see anything more than I can get from https://ossinsight.io/analyze/StarRocks/starrocks#overview


ossinsight also has comparison and collection tools that are worth checking out.


intake of 450,000 entries per second, and a daily data volume of 12TB

* Query performance improved 4x.
* Query P90 time improved from 500 ms to 150 ms.
* The cluster size was reduced from 60+ nodes to fewer than 10.
* Data storage volume was reduced by about 40%.
* Overall cost was reduced by more than 80%.
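
A quick back-of-the-envelope check on the ingest figures quoted above (the per-row size is an implication I'm deriving here, not a number from the case study; 12 TB is treated as decimal terabytes):

    rows_per_second = 450_000
    seconds_per_day = 86_400
    daily_bytes = 12e12  # 12 TB/day

    rows_per_day = rows_per_second * seconds_per_day  # 38.88 billion rows/day
    bytes_per_row = daily_bytes / rows_per_day        # roughly 310 bytes per row
    print(f"{rows_per_day:,.0f} rows/day, ~{bytes_per_row:.0f} bytes/row")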


It is my belief that data lakehouse analytics will replace data warehouse analytics in the future. Data lakehouses offer a number of advantages over traditional data warehouses.


Before we start, what are TPC-H and TPC-DS, and why are they important?

TPC-H and TPC-DS are important because they are industry standard benchmarks for measuring the performance of data warehouse and big data systems. They are widely used by vendors and customers to evaluate the performance of different systems and to compare the performance of the same system over time.
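
If you want to poke at TPC-H locally without standing up a full warehouse, one option is DuckDB's tpch extension, which generates the benchmark data and bundles the standard queries; the scale factor and query number below are just examples.

    import duckdb

    con = duckdb.connect()
    con.execute("INSTALL tpch")
    con.execute("LOAD tpch")
    con.execute("CALL dbgen(sf=0.1)")                # generate TPC-H tables at scale factor 0.1
    rows = con.execute("PRAGMA tpch(1)").fetchall()  # run TPC-H query 1 against them
    print(rows[:3])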


It only works with ClickHouse... ugh.


> The program is installed with clickhouse-client, has no dependencies, and works on almost any flavor of Linux. You can apply it to any database dump, not just ClickHouse. For instance, you can generate test data from MySQL or PostgreSQL databases or create development databases that are similar to your production databases.
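
For example, here's roughly how you'd run it over a TSV dump from another database (wrapped in Python for illustration; the flags follow the clickhouse-obfuscator docs, and the column structure and file names are hypothetical):

    import subprocess

    # Feed a TSV dump through clickhouse-obfuscator and write anonymized output.
    with open("users_dump.tsv", "rb") as src, open("users_obfuscated.tsv", "wb") as dst:
        subprocess.run(
            [
                "clickhouse-obfuscator",
                "--seed", "some-secret-seed",
                "--input-format", "TSV",
                "--output-format", "TSV",
                "--structure", "UserID UInt64, Email String, SignupDate Date",
            ],
            stdin=src,
            stdout=dst,
            check=True,
        )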

