For many production setups, taking a database snapshot involves transferring significant amounts of data over the network. The standard way to do this efficiently is to process data in batches. Batching reduces per-request overhead and helps maximize throughput, but it also introduces an important tuning problem: choosing the right batch size.
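To make the idea concrete, here is a minimal Go sketch of what batched snapshot reads could look like, using keyset pagination with pgx. The `events` table, its columns, and the connection string are placeholders for illustration; this is not pgstream's actual snapshot code.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/jackc/pgx/v5"
)

// readInBatches reads a table in fixed-size batches using keyset pagination,
// so each round trip transfers at most batchSize rows over the network.
func readInBatches(ctx context.Context, conn *pgx.Conn, batchSize int) error {
	lastID := 0
	for {
		rows, err := conn.Query(ctx,
			"SELECT id, payload FROM events WHERE id > $1 ORDER BY id LIMIT $2",
			lastID, batchSize)
		if err != nil {
			return err
		}

		count := 0
		for rows.Next() {
			var id int
			var payload string
			if err := rows.Scan(&id, &payload); err != nil {
				rows.Close()
				return err
			}
			lastID = id
			count++
			_ = payload // hand the row off to the snapshot consumer here
		}
		rows.Close()
		if err := rows.Err(); err != nil {
			return err
		}
		if count < batchSize {
			return nil // final (partial) batch reached
		}
	}
}

func main() {
	ctx := context.Background()
	conn, err := pgx.Connect(ctx, "postgres://localhost:5432/sourcedb")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close(ctx)

	if err := readInBatches(ctx, conn, 1000); err != nil {
		log.Fatal(err)
	}
	fmt.Println("snapshot read complete")
}
```

With this shape, the batch size caps how much data each round trip moves, which is exactly the knob whose ideal value depends on the network between the two databases.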
A batch size that works well in a low-latency environment can become a bottleneck when snapshots run across regions or under less predictable network conditions. A static batch size configuration assumes a stable network, which rarely reflects reality.
In this blog post, we describe how we used automatic batch size tuning to optimize data throughput for Postgres snapshots in our open source tool pgstream: the constraints we worked under, and how we validated that the approach actually improves performance in production-like environments.
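Before diving into the details, the sketch below shows one possible shape for such a tuner: grow the batch size while observed throughput keeps improving, and back off when a batch slows down. The type, the doubling and halving factors, and the bounds are illustrative assumptions for this introduction, not pgstream's actual implementation.

```go
package main

import (
	"fmt"
	"time"
)

// batchTuner adjusts the batch size based on the throughput (rows per second)
// observed for the previous batch. The doubling/halving factors and bounds are
// illustrative assumptions, not pgstream's actual parameters.
type batchTuner struct {
	size           int
	minSize        int
	maxSize        int
	lastThroughput float64
}

// next records how the last batch performed and returns the size to use for
// the next one: grow while throughput keeps improving, back off when it drops.
func (t *batchTuner) next(rowsSent int, elapsed time.Duration) int {
	throughput := float64(rowsSent) / elapsed.Seconds()
	if throughput >= t.lastThroughput {
		t.size = min(t.size*2, t.maxSize) // network kept up: try a bigger batch
	} else {
		t.size = max(t.size/2, t.minSize) // throughput dropped: back off
	}
	t.lastThroughput = throughput
	return t.size
}

func main() {
	tuner := &batchTuner{size: 1000, minSize: 100, maxSize: 100000}

	// Simulated feedback: in a real snapshot, rowsSent and elapsed would come
	// from timing the actual transfer of each batch.
	fmt.Println(tuner.next(1000, 200*time.Millisecond)) // fast batch -> grows to 2000
	fmt.Println(tuner.next(2000, 600*time.Millisecond)) // slower -> back to 1000
}
```

A feedback loop of this kind converges toward the largest batch the current network conditions can sustain, rather than relying on a fixed value picked for one environment.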