From: | Andrew Johnson <andrewj(at)metronome(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | [PATCH v1] Add pg_stat_multixact view for multixact membership usage monitoring |
Date: | 2025-06-10 14:40:23 |
Message-ID: | CADkHZ=en+Ua0sbtcTdwaMGJjQF7gNFnQ7YTinWJj+jU_442ZzA@mail.gmail.com |
Lists: | pgsql-hackers |
Hello hackers,
I'd like to propose adding a new view named "pg_stat_multixact" to
expose multixact member usage. This addresses a major monitoring gap
that ultimately led to a production outage at Metronome [1].
Problem
Multixact membership exhaustion is an edge case that can cause write
lockouts, yet PostgreSQL currently provides no visibility into
membership space usage. Without direct telemetry from the database,
operators are essentially flying blind. Usage can be estimated by
scanning the filesystem, but that method has several drawbacks, which
Naga Appani outlined in a previous thread [2].
This complements Peter Geoghegan's recent thread about vacuum failsafe
improvements [3], where Sami Imseih noted "exposing the members
count... will be a good idea as well" [4].
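For reference, the filesystem-based estimate mentioned above can be
sketched roughly as follows. This is only an illustration, not part of
the patch: it assumes a default build (8 kB blocks, 32 pages per SLRU
segment, 1636 member entries per page), the function name
estimate_members is hypothetical, and it treats every segment file as
completely full, so it over-estimates.

```python
import os
import re

# Assumed constants for a default build (8 kB block size).
PAGES_PER_SEGMENT = 32      # pages per SLRU segment file
MEMBERS_PER_PAGE = 1636     # member entries per members page
MEMBERS_PER_SEGMENT = PAGES_PER_SEGMENT * MEMBERS_PER_PAGE  # 52352

# Segment files are named with uppercase hex digits, e.g. "0000".
SEGMENT_NAME = re.compile(r"^[0-9A-F]{4,}$")


def estimate_members(members_dir):
    """Upper-bound estimate: count segment files and assume each is full."""
    segments = [n for n in os.listdir(members_dir) if SEGMENT_NAME.match(n)]
    return len(segments) * MEMBERS_PER_SEGMENT
```

It would be pointed at the pg_multixact/members directory inside the
data directory. It needs filesystem access on the server and is only
accurate to segment granularity.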
Solution
- New view (pg_stat_multixact) with the columns "members" (bigint) and
"update_timestamp" (timestamptz).
- Updates member count and timestamp during multixact allocation and
freeze threshold checks.
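As a rough idea of how a monitoring job might consume the two columns,
here is a minimal sketch. It assumes the values have already been read
from the view, that the member space is the 32-bit offset space (2^32
entries), and that the thresholds shown are placeholders; the function
name assess and its parameters are hypothetical and not part of the
patch.

```python
from datetime import datetime, timedelta, timezone

# Assumption: member offsets are 32-bit, so the space holds 2^32 entries.
MEMBER_SPACE = 2 ** 32


def assess(members, update_timestamp, now,
           warn_fraction=0.5, max_age=timedelta(minutes=10)):
    """Return (fraction_used, warnings) for one row of the proposed view."""
    fraction = members / MEMBER_SPACE
    warnings = []
    if fraction >= warn_fraction:
        warnings.append("member space usage above threshold")
    if now - update_timestamp > max_age:
        warnings.append("statistic is stale")
    return fraction, warnings
```

The staleness check is there because the counter is only refreshed at
the points listed above, so an old update_timestamp means the member
count may lag behind actual usage.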
I've attached a patch that:
- Implements this view using pgstat patterns.
- Includes isolation tests.
- Includes documentation changes to monitoring.sgml.
I have also:
- Confirmed that initdb works.
- Confirmed that make check-world passes on a build configured with
--enable-tap-tests.
I'm aiming to get this into the upcoming CommitFest. I would
appreciate your thoughts on this proposal and attached patch.
[1] https://metronome.com/blog/root-cause-analysis-postgresql-multixact-member-exhaustion-incidents-may-2025
[2] https://www.postgresql.org/message-id/flat/CALdSSPi3Gh08NtcCn44uVeUAYGOT74sU6uei_06qUTa5rMK43g(at)mail(dot)gmail(dot)com#bfd9ae766ef42f7599258183aa8ddb3b
[3] https://www.postgresql.org/message-id/CAH2-WzmLPWJk3gbAxy8dHY+A-Juz_6uGwfe6DkE8B5-dTDvLcw@mail.gmail.com
[4] https://www.postgresql.org/message-id/CAA5RZ0u43s4YbR%3D0mJ0_k3VGWjchJHhYnCoaZVzeLd3ccZtwhQ%40mail.gmail.com
--
Respectfully,
Andrew Johnson
Software Engineer
Metronome, Inc.
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Adding-pg_stat_muiltixact-view-to-allow-membershi.patch | application/octet-stream | 21.9 KB |