Latest Blog Posts

Understanding Split-Brain Scenarios in Highly Available PostgreSQL Clusters
Posted by semab tariq in Stormatics on 2025-05-30 at 10:49

High Availability (HA) refers to a system design approach that ensures a service remains accessible even in the event of hardware or software failures. In PostgreSQL, HA is typically implemented through replication, failover mechanisms, and clustering solutions to minimize downtime and ensure data consistency. This makes HA essential for mission-critical applications.

In this blog post, we will explore a critical failure condition known as a split-brain scenario that can occur in PostgreSQL HA clusters. We will first see what split-brain means, then how it can impact PostgreSQL clusters, and finally discuss how to prevent it through architectural choices and tools available in the PostgreSQL ecosystem.

What is a Split-Brain Scenario?

A split-brain scenario occurs when two or more nodes in a cluster lose communication with each other but continue operating as if they are the primary (or leader) node.

We know that the standby node receives data from the primary in real time. However, if a network glitch occurs between the primary and the standby, and the standby is unable to receive data for a certain timeout period, it may assume that the primary has failed. 

As a result, the standby might promote itself to a new primary. But since the issue was only a temporary network problem, the original primary is still active. This leads to a situation where two primary nodes are accepting writes at the same time, which is a highly dangerous state for the cluster.

Split-brain is particularly dangerous because it breaks the fundamental guarantees of consistency in an HA cluster. Writes can happen independently on multiple primary nodes, and reconciling those changes later is often impossible without data loss or manual intervention.
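
To make the failure state easy to spot, here is a minimal sketch (assuming plain streaming replication rather than any particular HA tool) of checks you can run on each node; in a healthy cluster exactly one node should report that it is not in recovery:

-- Run on every node. A healthy cluster has exactly one node answering
-- 'false' (the primary); during split-brain, two or more nodes answer 'false'.
SELECT pg_is_in_recovery() AS is_standby;

-- On the node that believes it is primary, list the standbys still attached.
-- An empty result on a "primary" can be a warning sign that the standbys
-- have been promoted elsewhere.
SELECT application_name, state, sync_state
FROM pg_stat_replication;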

Common Scenarios That Lead to Split-Brain

As we have seen earlier, the most common cause of split-brain is a network failure between the primary and standby nodes. However, there are several other situations where a cluster might also encounte

[...]

Bridged Indexes in OrioleDB: architecture, internals & everyday use?
Posted by Alexander Korotkov on 2025-05-30 at 00:00

Since version beta10, OrioleDB supports building indexes other than B-tree. Bridged indexes are the mechanism that supports these index types on OrioleDB tables.

1. Why OrioleDB needs a “bridge”

OrioleDB stores its table rows inside a B-tree built on the table's primary key and keeps MVCC information in an undo log, so it can’t simply plug PostgreSQL’s existing Index Access Methods (GiST, GIN, SP-GiST, BRIN, …) into that structure, because those Index Access Methods:

  • reference a 6-byte ctid (block number and offset in the heap) -- not a logical key;
  • keep every live version of a row in the index, leaving visibility checks to the executor;
  • support inserts only in the index and rely on VACUUM for physical deletion.

OrioleDB indexes, in contrast, are MVCC-aware: they point to the rows via primary-key values and support logical updates/deletes directly in the index. To remain heap-free while still letting users build indexes from the rich ecosystem of non-B-tree access methods, OrioleDB introduces a bridge index layer.

Illustration of a bridge index

2. How the bridge works under the hood

  1. Virtual iptr column -- an incrementally increasing "index pointer" automatically added to the table. The new value of iptr is assigned each time any column referenced by a bridged index is updated, ensuring the pointer remains stable for the indexed data.
  2. Bridge index -- a lightweight secondary index that maps iptr to primary-key value. It behaves like a normal OrioleDB secondary B-tree, except it doesn't use undo log for MVCC.
  3. PostgreSQL indexes (GIN/GiST/...) are built on the iptr values instead of ctids, so their structure stays compatible with the IndexAM API. During scans, the engine looks up iptr, translates it through the bridge index, and then fetches the row by primary key.
  4. The vacuum process collects stale iptr-s that are not visible to any snapshot, and asks the underlying IndexAM to clean up; then physically deletes the same pointers from the bridge index.

The result is a tri-level lookup path: Inde

[...]
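
The excerpt is cut off above, but as a rough sketch of where the bridge sits in everyday use (my illustration, not taken from the post; the exact options or prerequisites in beta10 may differ), a non-B-tree index on an OrioleDB table is declared with ordinary SQL and built through the bridge behind the scenes:

-- Requires a PostgreSQL build with the OrioleDB extension.
CREATE EXTENSION IF NOT EXISTS orioledb;

CREATE TABLE articles (
    id   bigint PRIMARY KEY,   -- rows live in a B-tree keyed on this value
    body tsvector
) USING orioledb;

-- Declared as ordinary SQL; entries reference the virtual iptr, which the
-- bridge index maps back to the primary-key value during scans.
CREATE INDEX articles_body_idx ON articles USING gin (body);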

Postgres Extensions: Use PG_MODULE_MAGIC_EXT
Posted by David Wheeler in Tembo on 2025-05-29 at 22:09

A quick note for PostgreSQL extension maintainers: PostgreSQL 18 introduces a new macro: PG_MODULE_MAGIC_EXT. Use it to name and version your modules. Where your module .c file likely has:

PG_MODULE_MAGIC;

Or:

#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif

Change it to something like:

#ifdef PG_MODULE_MAGIC_EXT
PG_MODULE_MAGIC_EXT(.name = "module_name", .version = "1.2.3");
#else
PG_MODULE_MAGIC;
#endif

Replace the name of your module and the version as appropriate. Note that PG_MODULE_MAGIC was added in Postgres 8.2; if for some reason your module still supports earlier versions, use a nested #ifdef to conditionally execute it:

#ifdef PG_MODULE_MAGIC_EXT
PG_MODULE_MAGIC_EXT(.name = "module_name", .version = "1.2.3");
#else
#ifdef PG_MODULE_MAGIC
PG_MODULE_MAGIC;
#endif
#endif

If you manage the module version in your Makefile, as the PGXN Howto suggests, consider renaming the .c file to .c.in and changing the Makefile like so:

  • Replace .version = "1.2.3" with .version = "__VERSION__"

  • Add src/$(EXTENSION).c to EXTRA_CLEAN

  • Add this make target:

    src/$(EXTENSION).c: src/$(EXTENSION).c.in
            sed -e 's,__VERSION__,$(EXTVERSION),g' $< > $@
    
  • If you use Git, add /src/*.c to .gitignore

For an example of this pattern, see semver@3526789.

That’s all!

Tip: Put your Rails app on a SQL Query diet
Posted by Andrew Atkinson on 2025-05-29 at 17:29

Introduction

Much of the time spent processing HTTP requests in web apps goes to SQL queries. To minimize that, we want to avoid unnecessary or duplicate queries, and generally perform as few queries as possible.

Think of the work that needs to happen for every query. The database engine parses it, creates a query execution plan, executes it, and then sends the response to the client.

When the response reaches the client, there’s even more work to do. The response is transformed into application objects in memory.
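
To see that per-query work directly, EXPLAIN (ANALYZE) reports planning and execution time separately (the table name and timings below are illustrative only):

-- Table name and timings are illustrative only.
EXPLAIN (ANALYZE)
SELECT * FROM books WHERE author_id = 1;
--  ...
--  Planning Time: 0.1 ms
--  Execution Time: 0.5 ms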

How do we see how many queries are being created for our app actions?

Count the queries

When doing backend work in a web app like Rails, monitor the number of queries being created directly, by the ORM, or by libraries. ORMs like Active Record can generate more than one query from a given line of code. Libraries can generate queries that are problematic and may be unnecessary.

Over time, developers may duplicate queries unknowingly. These are all real causes of unnecessary queries from my work experience.
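
The post counts queries from the Rails side; as a complementary, database-side sketch (assuming the pg_stat_statements extension is installed and preloaded), you can also ask Postgres which statements run most often:

-- Requires the pg_stat_statements extension (shared_preload_libraries plus
-- CREATE EXTENSION pg_stat_statements).
SELECT calls, query
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 10;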

Why are excessive queries a problem?

Why reduce the number of queries?

Besides parsing, planning, executing, and serializing the response, the client is subject to a hard upper limit on the number of TCP connections it can open to the database server.

In Postgres that’s configured as max_connections. The application will have a variable number of open connections based on usage and on its configuration of processes, threads, and connection pool. Keeping the query count low helps avoid exceeding that upper limit.
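
To see where an application stands against that limit, a couple of standard catalog queries show the configured ceiling and the connections currently open:

-- The hard ceiling configured on the server.
SHOW max_connections;

-- Connections currently open, grouped by application, to see how much
-- headroom the app's pools and workers are leaving.
SELECT application_name, count(*) AS connections
FROM pg_stat_activity
GROUP BY application_name
ORDER BY connections DESC;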

What about memory use?

What about app server memory?

With Ruby on Rails, the cost of repeated queries is shifted because the SQL Cache is enabled by default, which stores and serves results for matching repeated queries, at the cost of some memory use.

As an aside, from Rails 7.1 the SQL Cache uses a least recently used (LRU) algorithm. We can also configure the max number of queries to cache, 100 by default, to control how much

[...]

Don't mock the database: Data fixtures are parallel safe, and plenty fast
Posted by Brandur Leach in Crunchy Data on 2025-05-29 at 13:00

The API powering our Crunchy Bridge product is written in Go, a language that provides a good compromise between productivity and speed. We're able to keep good forward momentum on getting new features out the door, while maintaining an expected latency of low double digits of milliseconds for most API endpoints.

A common pitfall for new projects in fast languages like Go is that their creators, experiencing a temporary DX sugar high of faster compile and runtime speeds than they've previously encountered in their career, become myopically focused on performance above anything else, and start making performance optimizations with bad cost/benefit tradeoffs.

The textbook example of this is the database mock. Here's a rough articulation of the bull case for this idea: CPUs are fast. Memory is fast. Disks are slow. Why should tests have to store data to a full relational database with all its associated bookkeeping when that could be swapped out for an ultra-fast, in-memory key/value store? Think of all the time that could be saved by skipping that pesky fsync, not having to update that plethora of indexes, and foregoing all that expensive WAL accounting. Database operations measured in hundreds of microseconds or even *gasp*, milliseconds, could plausibly be knocked down to 10s of microseconds instead.

Mock everything, test nothing

Anyone who's substantially journeyed down the path of database mocks will generally tell you that it leads nowhere good. They are fast (although disk speed has improved by orders of magnitude over the last decade), but every other one of their aspects leaves something to be desired.

A fatal flaw is that an in-memory mock bears no resemblance to a real database and the exhaustive constraints that real databases put on input data. Consider, for example, whether a mock would fail like a database in any of these scenarios (a sketch follows the list):

  • A value is inserted for a column that doesn't exist.
  • A value of the wrong data type for a column is inserted.
  • Duplicate values are inse
[...]
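
As a concrete illustration of the kinds of failures a real database raises immediately and an in-memory mock typically swallows, here is a hedged sketch (table and constraint names are invented):

-- Hypothetical schema, for illustration only.
CREATE TABLE accounts (
    id    bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email text NOT NULL UNIQUE,
    plan  text NOT NULL CHECK (plan IN ('free', 'pro'))
);

INSERT INTO accounts (email, plan) VALUES ('a@example.com', 'free');

-- Each of these fails immediately against a real database; a naive
-- in-memory mock typically stores them without complaint.
INSERT INTO accounts (email, plan) VALUES ('a@example.com', 'free');  -- unique violation
INSERT INTO accounts (email, plan) VALUES (NULL, 'pro');              -- not-null violation
INSERT INTO accounts (email, plan) VALUES ('b@example.com', 'gold');  -- check violation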

Advanced Patch Feedback Session (APFS) at pgconf.dev 2025
Posted by Tomas Vondra on 2025-05-29 at 12:00

The pgconf.dev conference, a revamp of the original PGCon, happened about two weeks ago. It’s the main event for Postgres developers, and one of the things we’re trying is an Advanced Patch Feedback Session (APFS).

We first tried that last year in Vancouver, and then again in Montreal. But I realized many people attending the conference either are not aware of the event at all, or are not sure what it’s about. So let me explain, and share some reflections from this year.

PostgreSQL Europe Diversity Task Force (Deep Dive and Updates)
Posted by Karen Jex in Crunchy Data on 2025-05-28 at 17:05

I'm sharing the slides and transcript from my talk about the PostgreSQL Europe Diversity Task Force at PostgreSQL Development Conference 2025.

It's an extended version of the 5 minute lightning talk that I gave at FOSDEM PGDay earlier this year, with some updates on what we've achieved.

If you want a shorter read, feel free to check out the annotated slides from the lightning talk instead!



PostgreSQL Europe Diversity Task Force PGConf.dev 2025, Montreal

You may or may not already know that PostgreSQL Europe (PGEU) launched a Diversity Task Force last year.



  • But why have we created the diversity task force?
  • Who’s working on it?
  • What’s being done?
  • How can you get involved?

and, maybe more importantly, how can we help you?



Diversity definition from Link in text below

Firstly, and briefly, what does diversity actually mean?

This poor little word has unfortunately had some bad press recently but, in a nutshell, it just means variety.

Especially

“The inclusion of people of different races, cultures etc. in a group or organisation”



List of different types of diversity - ethnicity, gender, cultural background, language, age, race, marital/parental status, education, socioeconomic background, worldview, LGBTQIA+, religion, neurodiversity, disability, knowledge/experience

Gender and race are often the first things that spring to mind when people talk about diversity,

and of course, those are important.

But diversity is also about:

  • your work experience and current job role,
  • how and where you grew up,
  • the language you speak,
  • whether or not you have children or other caring responsibilities,
  • any physical disabilities or neurological differences you may have,
  • your educational background,
  • and much, much more.

This is far from an exhaustive list!

There are so many different things that bring variety to our community, and so many attributes that make each of us unique.



extract from pgconf.dev schedule advertising stacey haysler's talk 'Everything You Need to Know About Diversity in 25 Minutes'. Link below image

I encouraged the audience to go to Stacey Haysler's talk Everything You Need to Know About Diversity in 25 Minutes later the same day.

[The slides are available on the conference website.]

Why do we have a diversity task force?

Why do we have a diversity task force?



Bullet points repeated in text below

It’s no secret that there are challenges in terms of diversity across the entire tech industry, and the Postgres

[...]

pgroll 0.13.0: Start using pgroll on existing databases with the new baseline feature
Posted by Andrew Farries in Xata on 2025-05-27 at 09:00
The new baseline feature in pgroll 0.13.0 makes it easy to start using pgroll on databases with existing schema and keep your schema histories clean.

Prairie Postgres Developers Summit
Posted by Henrietta Dombrovskaya on 2025-05-27 at 01:38

Even before the Community Summit in Montreal, I wanted to organize a Developers Summit at one of our meetups. My inspiration came from the Open Spaces at DevOps Day Chicago. I saw how productive the smaller group discussions could be, and wanted to try that format for discussing different aspects of our interactions.

We finally made it happen on May 22! In the meetup announcement, I asked those signing up to vote on the topics we suggested and to name additional topics. Then, when the meetup started, we re-voted on the topics, and, same as in Montreal, the outcome was different :).

I talked with several meetup members about leading group discussions, so we were prepared. However, a person who volunteered to lead the AI discussion ended up not making it to the meetup, so those who wanted to discuss AI had to choose a different topic. We ended up with three discussions:

  • Using ORM with Postgres
  • Application-driven design
  • Monitoring Postgres applications

The discussions lasted about 45 minutes, and I had to force-end them. 🙂 Below are the summaries provided by the discussion leads, and a recording is coming soon!

Doğaç Eldenk, Application-driven design


In this discussion, we talked about common table design strategies to increase performance and enrich the developer experience. We talked about adding standardized columns such as “created_at” and “updated_at” to record some metadata about the row. Those columns are convenient for ETL jobs and developers. We discussed whether those should be assigned by the application or by the database using default values or triggers; we mostly favored database-managed values for uniformity and control. Moreover, we went through unique identifiers: IDs, UUID v4, and time-based UUID v7, and how they can benefit the developer and the database. Also, semi-structured data is pretty common in applications; for this reason, using JSON columns is a good option. For semi-structured repeated data, you can use an array type instead of normalizing the data into a separate table

[...]

Dirk Krautschick
Posted by Andreas 'ads' Scherbaum on 2025-05-26 at 14:00
PostgreSQL Person of the Week Interview with Dirk Krautschick: I am Dirk, living near Aachen in Germany, and I am currently a Senior Solution Architect with Aiven and a former DBA, Trainer, Consultant and Sales Engineer in several companies, but always related to databases - first Oracle only, later Oracle together with PostgreSQL.

Contributions for the week of 2025-05-19 (Weeks 20/21 overview)
Posted by Jimmy Angelakos in postgres-contrib.org on 2025-05-26 at 13:17
  • Postgres Extensions Day took place May 12 in Montreal, Canada, organized by Chris Travers and Yurii Rashkovskii, with Andrei Lepikhov, Michel Pelletier, and Gwen Shapira additionally on the Talk Selection Committee.

    • Speakers:
      • Hannu Krosing - From pl/v8 to pl/ : towards easier extension development
      • Andrey Borodin - Upgrade as an extension
      • Paul Jungwirth - Inlining Postgres Functions, Now and Then
      • Alvaro Hernandez - Postgres à la carte: dynamic container images with your choice of extensions
      • Yurii Rashkovskii - Extension Upgrades
      • Phil Eaton - Working with MemoryContexts and debugging memory leaks in Postgres
      • Sweta Vooda - Postgres as a Control Plane: Challenges in Offloading Compute via Extensions
      • Ruohang Feng - The Missing Postgres Extension Repo and Package Manager
      • David Wheeler - How to automatically release your extensions on PGXN
      • Cary Huang - Extending PostgreSQL with Java: Overcoming Development Challenges in Bridging Java and C Application
      • Cheng Chen - Rethinking OLAP Architecture: The Journey to pg_mooncake v0.2
      • Florents Tselai - Spat: Hijacking Shared Memory for a Redis-Like Experience in PostgreSQL
      • Yurii Rashkovskii - Cppgres: One less reason to hate C++
      • Alastair Turner - Extensibility: new options and a wish list
  • PGConf.DEV (PostgreSQL Developer Conference) took place May 13-16, in Montreal, Canada

[...]

PgPedia Week, 2025-05-25
Posted by Ian Barwick on 2025-05-26 at 11:30
PostgreSQL 18 changes this week

Sadly two features were reverted this week:

  • planner optimization "Partitions pruned by initial pruning are no longer locked" (commit 525392d5, reverted in commit 1722d5eb)
  • function pg_get_process_memory_contexts() (commit 042a6629, reverted in commit fb844b9f)

PostgreSQL 18 articles

  • UUIDv7 Comes to PostgreSQL 18 (2025-05-09) - Gwen Shapira

more...

How to Build Postgres with Zig: A Drop-In GCC/Clang Alternative
Posted by Álvaro Hernández in OnGres on 2025-05-23 at 14:52

What’s Zig and Zig’s C/C++ compiler?

In case you are not familiar with it, Zig is a programming language. Among other characteristics, Zig prides itself on being a low-level / systems programming language with great interoperability with C and C++. Unlike other comparable languages like Rust, it does explicit memory allocation / freeing (even though it adds cool features like the defer keyword), so it seems to have a mental model closer to C, in which Postgres is programmed. This also makes it a very interesting language for developing Postgres extensions in Zig, and there’s the pgzx framework to help with that.

But other than the extensions, what’s Zig bringing to Postgres and what’s this post really about? Zig’s compiler. It’s quite an advanced piece of technology that, apart from compiling Zig code, can also compile C/C++ code, and does so really well. There’s a mind-blowing blog post from Andrew Kelly, creator of Zig, that I’d recommend reading, about using Zig as a C/C++ compiler, claiming it is a powerful drop-in replacement for GCC/Clang.

zig cc, the command line for Zig’s C compiler, is included with the Zig distribution, which is by itself a self-contained, small downloadable package (41-50MiB on Linux, depending on the architecture). zig cc “supports the same options as Clang, which, in turn, supports the same options as GCC”, making it a drop-in replacement. To achieve this level of compatibility, zig cc uses LLVM behind the scenes (it’s technically a layer on top of an LLVM frontend). As a curiosity, Andrew’s post details how it’s possible that Zig’s distribution is (significantly) smaller than even LLVM binaries!

So if it is a drop-in replacement, building Postgres with zig cc should be easy, right? Let’s give it a try.

Building Postgres with zig cc

It turns out to be quite straightforward.

First we need to download Zig. Zig is statically linked ("Zig’s Linux tarballs are fully statically linked, and therefore work correctly on all Linux distributions.").

Download a

[...]

Big Problems From Big IN lists with Ruby on Rails and PostgreSQL
Posted by Andrew Atkinson on 2025-05-23 at 14:30

Introduction

If you’ve created web apps with relational databases and ORMs like Active Record (part of Ruby on Rails), you’ve probably experienced database performance problems after a certain size of data and query volume.

In this post, we’re going to look at a specific type of problematic query pattern that’s somewhat common.

We’ll refer to this pattern as “Big INs,” which are queries with an IN clause that has a big list of values. As data grows, the length of the list of values will grow. These queries tend to perform poorly for big lists, causing user experience problems or even partial outages.

We’ll dig into the origins of this pattern, why the performance of it is poor, and explore some alternatives that you can use in your projects.

IN clauses with a big list of values

The technical term for the values is a "parenthesized list of scalar expressions."

For example, in the SQL query below, the IN clause portion is WHERE author_id IN (1,2,3) and the list of scalar expressions is (1,2,3).

SELECT * FROM books
WHERE author_id IN (1, 2, 3);

The purpose of this clause is to perform filtering. Looking at a query execution plan in Postgres, we’ll see something like this fragment below:

Filter: (author_id = ANY ('{1,2,3}'::integer[]))

This of course filters the full set of books down to ones that match on author_id.

Filtering is a typical database operation. Why are these slow?

Parsing, planning, and executing

Remember that our queries are parsed, planned, and executed. A big list of values is treated as a set of constants, which don't have associated statistics.

Queries with big lists of values take more time to parse and use more memory.

Without pre-collected table statistics for planning decisions, PostgreSQL is more likely to mis-estimate cardinality and row selectivity.

This can mean the planner chooses a sequential scan over an index scan, causing a big slowdown.
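
The post's own alternatives are truncated here, but one commonly used option (a sketch, not necessarily what the author recommends) is to send a single array bind parameter instead of an ever-growing literal list, or to join against the unnested array:

-- Instead of a literal list that grows with the data:
--   SELECT * FROM books WHERE author_id IN (1, 2, 3, ...);

-- send one array bind parameter ($1) from the application:
SELECT * FROM books
WHERE author_id = ANY($1::int[]);

-- or join against the unnested array, which can help the planner when the
-- list is very large:
SELECT b.*
FROM unnest($1::int[]) AS ids(author_id)
JOIN books b USING (author_id);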

How do we create this pattern?

Creating this pattern directly

In Active Record

[...]

Collation Torture Test versus Debian
Posted by Jeremy Schneider on 2025-05-23 at 05:37

Collation torture test results are finally finished and uploaded for Debian.

https://github.com/ardentperf/glibc-unicode-sorting

The test did not pick up any changes in en_US sort order for either Bullseye or Bookworm 🎉

Buster has glibc 2.28 so it shows lots of changes – as expected.

The postgres wiki had claimed that Jessie(8) to Stretch(9) upgrades were safe. This is false if the database contains non-English characters from many scripts (even with the en_US locale). I just now tweaked the wording on that wiki page. I don’t think this is new info; I think it’s the same change that showed up in the Ubuntu tables under glibc 2.21 (Ubuntu 15.04)

FYI – the changelist for Stretch(9) does contain some pure ascii words like “3b3” but when you drill down to the diff, you see that it’s only moving a few lines relative to other strings with non-english characters:

@@ -13768521,42 +13768215,40 @@ $$.33
༬B༬
3B༬
3B-༬
-3b3
3B༣
3B-༣
+3B٣
+3B-٣
+3b3
3B3

In the process of adding Debian support to the scripts, I also fixed a few bugs. I’d been running the scripts from a Mac but now I’m running them from a Ubuntu laptop and there were a few minor syntax things that needed updating for running on Linux – even though, ironically, when I first started building these scripts it was on another Linux before I switched to Mac. I also added a file size sanity check, to catch if the sorted string-list file was only partly downloaded from the remote machine running some old OS (… realizing this MAY have wasted about an hour of my evening yesterday …)

The code that sorts the file on the remote instance is pretty efficient. It does the sort in two stages and the first stage is heavily parallelized to utilize whatever CPU is available. Lately I’ve mostly used c6i.4xlarge instances and I typically only need to run them for 15-20 minutes to get the data and then I terminate them. The diffs and table generation run locally. On my poor old laptop, the diff for buster ran at 100% cpu and 10°C hotter than the idle co

[...]

Community Summit at PGConf.dev
Posted by Henrietta Dombrovskaya on 2025-05-23 at 02:58

Last week, I presented at PGConf.dev for the first time and participated in a community summit for the first time. The idea was pitched by Teresa Giacomini, and that’s how this event was described in the program:

Community building, support, and maintenance are all critical to the future of Postgres. There are many types and layers to community building from events, podcasts, & meetups to extracurricular fun like chess & karaoke; recognition & rewards to Postgres booths at non-Postgres conferences; getting started in smaller communities to wrangling a global one.

In this 3-hour summit we will:

  • Have short presentations from the hosts on different aspects of community
  • Perform a short exercise to gather the group’s thoughts on some key questions:
    • What does community mean?
    • How do we make it easier for people to get involved?
    • What community initiatives already exist? What’s missing? How can we improve them?
  • Break into smaller groups to tackle areas the group believes are most important
  • Report out to the larger group from each small group
  • Each group adds their results to the PostgreSQL Wiki
  • Determine a way for us to track our progress moving forward

Pre-work: We will gather some interest prior to the summit on topics for discussion.

Due to the interactive nature of the summit, participation is limited to 60 people. Participants should be committed to building, supporting, or maintaining a community in some way, and be ready to leave the summit with concrete action items to move the Postgres community forward. While the hosts of this summit are from the US & Europe, we hope that folks from less established Postgres communities will join us.

Pat Wright and Andreas Scherbaum were the other two organizers. We started by asking the conference organizers to email the conference participants a questionnaire with a list of topics they would be interested in discussing. Then, we analyzed

[...]

Postgres Extensions Day 2025 Kicks Off with a Successful Debut in Montréal
Posted by cary huang in Hornetlabs Technology on 2025-05-22 at 18:48

Introduction

PostgreSQL Extension Day 2025 made its successful debut on May 12, 2025, just one day before the start of pgconf.dev 2025. This focused one-day event brought together the community around a single theme: the PostgreSQL extension ecosystem. From innovative ideas and development insights to discussions on safer extension building and delivery, the day was all about “everything extensions.”

The conference featured 14 tightly packed 25-minute talks, making for a long but highly productive day. For those unable to attend in person, the event was also live-streamed on YouTube.

Thanks to the hard work of the organizers and volunteers, PostgreSQL Extension Day 2025 turned out to be a great success. In this blog, I’ll walk through some of the key highlights and takeaways from this event.

Conference Highlights

Community and Volunteer Driven

Since this was the first-ever pgext.day conference, organized by Yurii Rashkovskii, there was plenty of room for things to go sideways. Fortunately, a small but dedicated team—including Grant Zhou, Sweta Vooda, Charis Charalampidi, and myself—volunteered to support Yurii with setting up the live streaming and recording equipment early in the morning. Together, we handled the camera setup, microphones, projector, and streaming rig, and quickly got up to speed on how to operate the entire system before the event began.

I have to say, by the time the conference started, I felt surprisingly confident running the live streaming, camera work, and digital recording gear—a fun learning experience in itself!

Social

The social aspect of a conference is just as important as the sessions themselves—it’s where connections are made, ideas are exchanged, and the community truly comes alive. At pgext.day 2025, we had the chance to enjoy dinner together both before and after the conference, giving everyone time to relax, share

[...]

Adventures in Extension Packaging
Posted by David Wheeler in Tembo on 2025-05-22 at 17:31

I gave a presentation at PGConf.dev last week, Adventures in Extension Packaging. It summarizes stuff I learned in the past year in developing the PGXN Meta v2 RFC, re-packaging all of the extensions on pgt.dev, and experimenting with the CloudNativePG community’s proposal to mount extension OCI images in immutable PostgreSQL containers.

Turns out a ton of work and experimentation remains to be done.

I’ll post the link to the video once it goes up, but in the meantime, here are the slides:

Previous work covers the first half of the talk.

The rest of the talk encompasses newer work. Read on for details.

Automated Packaging Challenges

Back in December I took over maintenance of the Trunk registry, a.k.a., pgt.dev, refactoring and upgrading all 200+ extensions and adding Postgres 17 builds. This experience opened my eyes to the wide variety of extension build patterns and configurations, even when supporting a single OS (Ubuntu 22.04 “Jammy”). Some examples:

  • pglogical requires an extra make param to build on PostgreSQL 17: make -C LDFLAGS_EX="-L/usr/lib/postgresql/17/lib"
  • Some pgrx extensions require additional params, for example:
  • pljava needs a pointer to libjvm: mvn clean install -Dpljava.libjvmdefault=/usr/lib/x86_64-linux-gnu/libjvm.so
  • plrust needs files to be moved arou
[...]

Best Practices for TimescaleDB Massive Delete Operations
Posted by semab tariq in Stormatics on 2025-05-22 at 13:30

Welcome to the second part of our TimescaleDB best practices series! In the first part, we explored how to perform massive backfill operations efficiently, sharing techniques to optimize performance and avoid common pitfalls. If you haven’t had a chance to read the first part yet, you can check it out using this link

In today’s blog, we will discuss another crucial aspect of time-series data management: massive delete operations.

As your data grows over time, older records often lose their relevance but continue to occupy valuable disk space, potentially increasing storage costs and degrading performance if not managed well.

Let’s walk through some strategies to clean up or downsample aged data in TimescaleDB, helping you maintain a lean, efficient, and cost-effective database.
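
One of the most common of those strategies, for data stored in hypertables, is to drop whole chunks rather than deleting row by row. A minimal sketch (hypothetical table name and retention window; check what your retention requirements actually allow):

-- Preview which chunks fall outside the retention window...
SELECT show_chunks('sensor_data', older_than => INTERVAL '1 year');

-- ...then drop them; this removes whole chunk tables, avoiding the dead
-- tuples and vacuum work a row-by-row DELETE would create.
SELECT drop_chunks('sensor_data', older_than => INTERVAL '1 year');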

Prerequisites for Massive Delete Operations

Here are a few important steps to follow when performing a large-scale delete in production, to ensure we are prepared in case something goes wrong.

Tune Autovacuum Settings 

In PostgreSQL, VACUUM is a maintenance process that removes dead tuples, obsolete row versions left behind by UPDATE or DELETE operations. These dead tuples occupy space but are no longer visible to any active transactions. Vacuuming reclaims this space, helping to reduce table bloat and maintain database performance.

The autovacuum feature automates this process by periodically running in the background, ensuring that dead tuples are cleaned up without manual intervention. This is especially important after large delete operations, where a significant number of dead rows can accumulate. If not handled promptly, this can lead to table bloat, increased I/O, and slower query performance.

However, its effectiveness depends heavily on how well it is configured. Without proper tuning, autovacuum may run too infrequently or too slowly, allowing dead tuples to pile up and impact performance.
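
As a hedged illustration of what such tuning can look like (the post's recommended values are truncated below; the numbers here are placeholders, not recommendations), autovacuum behavior can be made more aggressive for a specific table via per-table storage parameters:

-- Placeholder values for illustration only; appropriate settings depend on
-- your workload. For a TimescaleDB hypertable, the same parameters may also
-- need to be applied to individual chunks.
ALTER TABLE sensor_data SET (
    autovacuum_vacuum_scale_factor = 0.05,  -- vacuum once ~5% of rows are dead
    autovacuum_vacuum_cost_limit   = 2000,  -- allow each run to do more work
    autovacuum_vacuum_cost_delay   = 2      -- shorter pause (ms) between work batches
);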

Here is a list of important autovacuum parameters along with their recommended values th

[...]

pg_dump speed across versions
Posted by Hubert 'depesz' Lubaczewski on 2025-05-22 at 07:44
Got interested recently in the speed of pg_dump. Specifically, if, over the years, it has become faster, and if yes, how much. A couple of years ago I was in a position where we needed to run pg_dump, and found some inefficiencies, which got patched later. This was around version 12. So, how does the situation look … Continue reading "pg_dump speed across versions"

pgconf.dev 2025 Wraps Up with Great Success in Montreal
Posted by cary huang in Hornetlabs Technology on 2025-05-21 at 23:31

Introduction

pgconf.dev 2025 just wrapped up in Montreal, Canada, following its successful debut in Vancouver last year—and once again, it delivered a fantastic mix of deep technical content and strong community social activities.

As always, the focus was on both the current state and future direction of PostgreSQL, with over 40 thoughtfully curated technical talks covering performance, storage, extensions, and new features. The week wasn’t just about technical talks though—there were plenty of chances to connect through community events like Meet & Eat, the Social Run, and group dinners, making the experience as social as it was informative.

Montreal brought its own unique charm to the event. With its French-speaking culture, beautiful Old Town, and scenic waterfront, the city felt a little like Europe—laid-back, stylish, and totally different from the west coast vibe of Vancouver. Oh, and the food? Absolutely amazing!

WARNING: long blog post

Conference Highlights

Here are some personal highlights from pgconf.dev 2025, based on my own experience and participation throughout the week. I’ve made an effort to capture key takeaways from the talks I attended, and included photos from the conference to give you a feel for the energy, community, and atmosphere of the event.

Sponsor Swags

At the conference sign-in desk, a colorful array of sponsor swag was neatly displayed alongside the official pgconf.dev T-shirts. From stickers and pens to notebooks, socks, and other branded goodies, the table was a treasure trove for attendees. Everyone was welcome to help themselves and take as many items as they needed — a small but thoughtful way for sponsors to share their appreciation and for participants to bring home a piece of the event. The generous assortment added a lively and welcoming touch to the registration area, setting a positive tone from the moment attendees walked in.

Have you

[...]

Xata Agent v0.3.1: Custom tools via MCP, Ollama integration, support for reasoning models & more
Posted by Gülçin Yıldırım Jelínek in Xata on 2025-05-21 at 12:30
Version 0.3.1 of the open-source Xata Agent adds support for custom MCP servers and tools, introduces Ollama as a local LLM provider, and includes support for reasoning models O1 and O4-mini.

pdot 1.0.0: Exploring Databases Visually, Part III
Posted by Dian Fay on 2025-05-21 at 00:00

In what I can't say isn't a tradition at this point, we're in an odd-numbered year so there's news on the pdot front! Get it here!

The biggest change (and the reason for the big 1-0-0) is simplifying usage: rather than requiring a shell function to plug the graph body into a template for interactive use, pdot now outputs the entire digraph or flowchart markup. The old behavior is still available with the --body flag, but the new default means it's a lot easier to get started -- pdot postgres_air fks | dot -Tpng | wezterm imgcat and go. You only need scripting to do the pipelining for you, or to customize the graph's appearance.

Other notable updates along the way:

  • PGHOST, PGDATABASE, PGUSER, and PGPASSWORD environment variables are honored
  • new policies graph, and many improvements to others especially triggers and function refs
  • usable as a Rust library!

Late last year I also presented at PGConf.EU in Athens, should you be interested.

Data archiving and retention in PostgreSQL. Best practices for large datasets
Posted by DARIA NIKOLAENKO in Data Egret on 2025-05-20 at 17:08

Just over a week ago, I attended PGConf.DE 2025 in Berlin with the rest of the Data Egret team and gave a talk titled “Data Archiving and Retention in PostgreSQL: Best Practices for Large Datasets.” This post is a written version of my talk for those who couldn’t attend.

Below, you’ll find each slide from the talk — along with what was said.

I started by talking about something that happens with almost every Postgres database — the slow, steady growth of data. Whether it’s logs, events, or transactions — old rows pile up, performance suffers, and managing it all becomes tricky. My talk focused on practical ways to archive, retain, and clean up data in PostgreSQL, without breaking queries or causing downtime.

As you can see, my work with Postgres focuses a lot on monitoring, performance, and automation. I do that at Data Egret, where we help teams run Postgres reliably, both on-prem and in the cloud.

We specialise entirely in Postgres and are deeply involved in the community. We help companies with scaling, migrations, audits, and performance tuning — everything around making Postgres run better.

I was also excited to share that Data Egret is now a part of a new initiative in the Postgres ecosystem: The Open Alliance for PostgreSQL Education. It’s an effort to build open, independent, community-driven certification.

Then I dived into the topic of my talk.

Postgres can handle big tables, but once data starts piling up, it doesn’t always degrade gracefully:

  • queries slow down,
  • VACUUM takes longer,
  • indexes grow,
  • backups get heavier.

And often, you’re keeping old data around for reporting, audits, or just in case. And that’s OKAY. Because the issue isn’t really volume — it’s how we manage it.

This isn’t about discarding data — it’s about managing it wisely. Frequently used, or ‘hot’ data, should remain readily accessible and fast to query, without being archived or moved to storage.
And cold data? Move, c

[...]
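
The rest of the talk is truncated here, but to give a flavour of the "manage it wisely" theme, here is a sketch of two common retention moves (table and partition names invented), not necessarily the ones the talk recommends:

-- 1. With declarative partitioning, an old partition can be detached without
--    rewriting the table, then dumped to cheaper storage and dropped.
ALTER TABLE events DETACH PARTITION events_2023 CONCURRENTLY;

-- 2. Without partitioning, delete in bounded batches (re-run until zero rows
--    are affected) to keep bloat, WAL volume, and replication lag in check.
DELETE FROM events
WHERE id IN (
    SELECT id
    FROM events
    WHERE created_at < now() - interval '2 years'
    LIMIT 10000
);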

Short alphanumeric pseudo random identifiers in Postgres
Posted by Andrew Atkinson on 2025-05-20 at 16:00

Introduction

In this post, we’ll cover a way to generate short, alphanumeric, pseudo random identifiers using native Postgres tactics.

These identifiers can be used for things like transactions or reservations, where users need to read and share them easily. This approach is an alternative to using long, randomly generated values like UUIDs, which have downsides for usability and performance.

We’ll call the identifier a public_id and store it in a column with that name. Here are some example values:

SELECT public_id
FROM transactions
ORDER BY random()
LIMIT 3;

 public_id
-----------
 0359Y
 08nAS
 096WV

Natural and Surrogate Keys

In database design, we design our schema to use natural and surrogate keys to identify rows.

For our public_id identifier, we’re going to generate it from a conventional surrogate integer primary key called id. We aren’t using natural keys here.

The public_id is intended for use outside the database, while the id integer primary key is used inside the database to be referenced by foreign key columns on other tables.

While public_id is short, which minimizes space and speeds up access, the main reason for it is usability.

With that said, the target for total space consumption was to be fewer bytes than a 16-byte UUID. This was achieved with an integer primary key and this additional 5 character generated value, targeting a smaller database where this provides plenty of unique values now and into the future.

Let’s get into the design details.

Design Properties

Here were the desired design properties:

  • A fixed size, 5 characters in length, regardless of the size of the input integer (and within the range of the integer data type)
  • Fewer bytes of space than a uuid data type
  • An obfuscated value, pseudorandom, not easily guessable. While not easily guessable, this is not meant to be “secure”
  • Reversibility back into the original integer
  • Only native Postgres capabilities, no extensions, client web app langu
[...]
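
The remaining design details are truncated above. As a sketch of the fixed-width encoding half only (my illustration, not the author's implementation; it omits the obfuscation step, so consecutive ids still map to consecutive strings), an integer can be base-62 encoded and left-padded to five characters in plain SQL:

-- Sketch only: fixed-width base-62 encoding of an integer, without the
-- pseudorandom obfuscation the post describes. Values up to 62^5 - 1
-- (916,132,831) fit in five characters.
CREATE OR REPLACE FUNCTION to_base62_5(n bigint) RETURNS text
LANGUAGE sql IMMUTABLE AS $$
  WITH RECURSIVE digits(remainder, acc) AS (
    SELECT n, ''::text
    UNION ALL
    SELECT remainder / 62,
           substr('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz',
                  (remainder % 62)::int + 1, 1) || acc
    FROM digits
    WHERE remainder > 0
  )
  -- Only the final iteration has remainder = 0; pad its result to 5 characters.
  SELECT lpad((SELECT acc FROM digits WHERE remainder = 0), 5, '0');
$$;

SELECT to_base62_5(12345);  -- => 003D7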

Auto-Release PostgreSQL Extensions on PGXN
Posted by David Wheeler in Tembo on 2025-05-20 at 15:49

I last wrote about auto-releasing PostgreSQL extensions on PGXN back in 2020, but I thought it worthwhile, following my Postgres Extensions Day talk last week, to return again to the basics. With the goal to get as many extensions distributed on PGXN as possible, this post provides step-by-step instructions to help the author of any extension or Postgres utility to quickly and easily publish every release.

TL;DR

  1. Create a PGXN Manager account
  2. Add a META.json file to your project
  3. Add a pgxn-tools powered CI/CD pipeline to publish on tag push
  4. Fully-document your extensions

Release your extensions on PGXN

PGXN aims to become the de facto source for all open-source PostgreSQL extensions and tools, in order to help users quickly find and learn how to use extensions to meet their needs. Currently, PGXN distributes source releases for around 400 extensions (stats on the about page), a fraction of the ca. 1200 known extensions. Anyone looking for an extension that might exist to solve some problem must rely on search engines to find potential solutions among PGXN, GitHub, GitLab, blogs, social media posts, and more. Without a single trusted source for extensions, and with the proliferation of AI slop in search engine results, finding extensions aside from a few well-known solutions proves a challenge.

By publishing releases and full documentation — all fully indexed by its search index — PGXN aims to be that trusted source. Extension authors provide all the documentation, which PGXN formats for legibility and linking. See, for example, the pgvector docs.

If you want to make it easier for users to find your extensions, to read your documentation — not to mention provide sources for binary packaging systems — publish every release on PGXN.

Here’s how.

Create an Account

Step one: create a PGXN Manager account. The Emai

[...]

pgstream v0.5.0 update
Posted by Esther Minano in Xata on 2025-05-20 at 12:00
Improved user experience with new transformers, YAML configuration, CLI refactoring and table filtering.

Welcome new GSoC 2025 contributors!
Posted by Pavlo Golub in Cybertec on 2025-05-20 at 06:00

I’m pleased to welcome seven new Google Summer of Code 2025 contributors to the Postgres community!

I encourage you to welcome contributors during these first weeks to get them excited and invested in our community. You will meet them on mailing lists, Slack, Discord, and other media.

The table below details information about this year's projects, contributors, and mentors!

Project Title                                      | Contributor    | Assigned Mentors
Enhancements to pgwatch v3 RPC integration         | Ahmad Gouda    | Akshat Jaimini, Pavlo Golub
pgmoneta: Incremental backup for PostgreSQL 13-16  | Ashutosh Sh    | Haoran Zhang, Jesper Pedersen
Extension Support for pgexporter                   | Bassam Adnan   | Saurav Pal, Jesper Pedersen
Upgrade pgwatch Grafana dashboards to v11          | Gaurav Patidar | Rajiv Harlalka, Pavlo Golub
ABI Compliance Checker                             | Mankirat Singh | David Wheeler, Pavlo Golub
pgmoneta: WAL Filtering                            | Mohab Yasser   | Shahryar Soltanpour, Jesper Pedersen
Enhancing Pgagroal Security                        | Tejas Tyagi    | Luca Ferrari, Jesper Pedersen

We expect GSoC contributors to actively participate in the Community Bonding period from May 8th to June 1st. The goal of this period is to prepare contributors to begin their project work effectively on June 2nd, so please help them settle in.

It was an insane start to the year! The GSoC program had the highest number of proposals ever, as well as the highest number of spam and AI-generated applications. Due to the high volume of new organizat

[...]

PgPedia Week, 2025-05-18
Posted by Ian Barwick on 2025-05-19 at 12:43

A very short edition this week...

PostgreSQL 18 changes this week

Following last week's beta1 release, things seem to have been quite quiet on all fronts, which hopefully means people are busy testing and not finding issues. From previous experience, this is the point in the release cycle where I start to review the changes over the past year and work out what I've missed (feedback always welcome!).

PostgreSQL 18 articles

  • Good time to test io_method (for Postgres 18) (2025-05-12) - Tomas Vondra discusses io_method and io_workers

more...

Hacking Workshop for June 2025
Posted by Robert Haas in EDB on 2025-05-19 at 12:00

Next month, I'll be hosting 2 or 3 discussions of Masahiko Sawada's talk, PostgreSQL meets ART - Using Adaptive Radix Tree to speed up vacuuming, from 2024.pgconf.dev. If you're interested in joining us, please sign up using this form and I will send you an invite to one of the sessions.

Read more »
