| From: | Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com> |
|---|---|
| To: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
| Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: 64-bit wait_event and introduction of 32-bit wait_event_arg |
| Date: | 2026-02-12 12:42:23 |
| Message-ID: | CAKZiRmxw1KwEPJZk8equXFyFweSt_X9hH59RdSAzpNROGEKG=w@mail.gmail.com |
| Lists: | pgsql-hackers |
On Wed, Jan 14, 2026 at 9:56 AM Jakub Wartak
<jakub(dot)wartak(at)enterprisedb(dot)com> wrote:
>
> On Wed, Jan 14, 2026 at 9:38 AM Bertrand Drouvot
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > On Fri, Jan 09, 2026 at 11:34:09AM +0100, Jakub Wartak wrote:
> > > On Tue, Dec 9, 2025 at 10:11 AM Jakub Wartak
> > > <jakub(dot)wartak(at)enterprisedb(dot)com> wrote:
> > > >
> > > > Hi Heikki, thanks for having a look!
> > > >
> > > > On Mon, Dec 8, 2025 at 11:12 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> > > > >
> > > > > On 08/12/2025 11:54, Jakub Wartak wrote:
> > > > > > While thinking about cons, the only cons that I could think of is that
> > > > > > when we would be exposing something as 32-bits , then if the following
> > > > > > major release changes some internal structure/data type to be a bit
> > > > > > more heavy, it couldn't be exposed anymore like that (think of e.g.
> > > > > > 64-bit OIDs?)
> > > > > >
> > > > > > Any help, opinions, ideas and code/co-authors are more than welcome.
> > > >
> > > > > Expanding it to 64 bit seems fine as far as performance is concerned. I
> > > > > think the difficult and laborious part is to design the facilities to
> > > > > make use of it.
> > > >
> > > > Right, I'm very interested in hearing what could be added there/what
> > > > people want (bonus points if that is causing some performance issues
> > > > today and we do not have the area covered and exposing that would fit
> > > > in 32-bits ;) )
> > > >
> > >
> > > OK, so v3 is attached. Changes in v3:
> >
> > Thanks for the new version!
> >
> > It looks like that it needs a rebase. Also, FWIW, a quick scan shows a few
> > numbers of "XXX" and elog calls commented out (that are probably used during
> > your own debugging?).
>
> Yes, indeed, that's intentional right now - it's more like a draft
> rather than something that should be polished.
>
> To be honest I would like to avoid sinking more time on it, if the
> sole idea gets shot down or there is opposition due e.g. to concerns
> of exposing 32-bit relfilenodes that way (see that 56-bit relfilenode
> idea).
Good afternoon gentlemen,
I was considering marking this as Rejected/RwF and giving up, because
RelFileNodes could become wider than 32 bits, which kind of goes against
the main intention of this patch (showing the relations involved
in some complex LWLock/MultiXact performance scenarios).
In offline discussions with Andres and Robert I've learned that:
1. there's still a chance that RelFileNodes could become 56-bit one day
2. introducing another uint64 just for wait_event_arg is a no-go
due to performance concerns.
3. exposing something like "relfilenode % (2^32)" is seen as a hack and could
cause issues (problems with interpretation/conflicts in the future when
RelFileNode gets bigger)
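To illustrate point 3, here is a minimal standalone sketch (not part of the patchset; the helper name is hypothetical) of why a "% 2^32" value would be ambiguous once RelFileNode is wider than 32 bits: two different wide values that differ only above bit 31 collapse to the same 32-bit argument.

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: keep just the low 32 bits
 * of a wider relfilenode, i.e. relfilenode % 2^32.
 */
static inline uint32_t
sketch_truncate_relfilenode(uint64_t relfilenode)
{
    return (uint32_t) (relfilenode & UINT32_MAX);
}

/*
 * Returns 1 when two distinct wide relfilenodes would be reported as the
 * same 32-bit value, so a reader of the wait event argument could not
 * tell them apart.
 */
static inline int
sketch_truncation_collides(uint64_t a, uint64_t b)
{
    return a != b &&
        sketch_truncate_relfilenode(a) == sketch_truncate_relfilenode(b);
}
```

For example, 16384 and 16384 + 2^32 are different 64-bit values, but both would show up as 16384.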
Anyway, today this WIP/PoC patchset gives:
postgres=# select type, substring(name, 1, 20) wait,
substring(waiteventarg_description,1,43) as desc from pg_get_wait_events()
where waiteventarg_description != '';
  type   |         wait         |                    desc
---------+----------------------+---------------------------------------------
 Buffer  | BufferCleanup        | Buffer# or UINT32_MAX for local(temporary)..
 Buffer  | BufferExclusive      | Buffer# or UINT32_MAX for local(temporary)..
 Buffer  | BufferShared         | Buffer# or UINT32_MAX for local(temporary)..
 Buffer  | BufferShareExclusive | Buffer# or UINT32_MAX for local(temporary)..
 IO      | SlruFlushSync        | SlruType: unknown(0), notify(1), clog(2), ..
 IO      | SlruRead             | SlruType: unknown(0), notify(1), clog(2), ..
 IO      | SlruSync             | SlruType: unknown(0), notify(1), clog(2), ..
 IO      | SlruWrite            | SlruType: unknown(0), notify(1), clog(2), ..
 IPC     | BufferIo             | Buffer# or UINT32_MAX for local(temporary)
 IPC     | RecoveryConflictTabl | tablespace Oid causing conflict.
 IPC     | SyncRep              | PID of the slowest walsender.
 Timeout | PgSleep              | how many seconds to sleep for.
 Timeout | SpinDelay            | Number of spinlock delays.
Summary of changes since previous version:
- Removed all relfilenode references, including
ProcSleep()->WaitLatch(..PG_WAIT_LOCK | locktag_field2 );
as we cannot take locktag_field2 (which maps to reloid, set by
SET_LOCKTAG_RELATION)
- In pgstat_report_wait_end(), replaced the direct volatile store of zero
with the more proper pg_atomic_write_u64(..,0);
- split out the SyncRepWaitForLSN() patch, as I have plenty of performance
concerns there (with abnormally high max_wal_senders). Today there is a full
scan under spinlocks every time the latch is reset. I could limit that scan
to run no more often than every N iterations, but how often should it run
then?
- added exposing Buffer# (one can look up the relation via pg_buffercache),
idea by Andres; it seems to work (simulated by fetching from a cursor):
  pid   |  type  |  wait_event   | wait_event_arg | state  | query
--------+--------+---------------+----------------+--------+----------------
 250556 | Buffer | BufferCleanup |            225 | active | VACUUM (FREEZE)..
postgres=# select
pg_filenode_relation(0, relfilenode)::regclass,
pinning_backends
from pg_buffercache where bufferid = 225;
 pg_filenode_relation | pinning_backends
----------------------+------------------
 pin_test             |                2
- added exposing Timeout/SpinDelay, not sure if that would be helpful
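
As a rough illustration of the encoding these changes rely on, here is a standalone sketch that is not taken from the patchset. It assumes the 64-bit wait_event_info keeps the existing 32-bit wait event in the high half and the new wait_event_arg in the low half; that layout is only inferred from the "PG_WAIT_LOCK | locktag_field2" usage above and is not confirmed. All sketch_* names and the SKETCH_* constants are hypothetical.

```c
#include <stdint.h>

/* Assumed layout (not confirmed): [ wait event : 32 bits | arg : 32 bits ] */
#define SKETCH_WAIT_ARG_BITS    32

/* Sentinel from the descriptions above: UINT32_MAX means a local buffer. */
#define SKETCH_LOCAL_BUFFER_ARG UINT32_MAX

/* Combine a classic 32-bit wait event with a 32-bit argument. */
static inline uint64_t
sketch_pack_wait_event(uint32_t wait_event, uint32_t arg)
{
    return ((uint64_t) wait_event << SKETCH_WAIT_ARG_BITS) | (uint64_t) arg;
}

/* Extract the classic 32-bit wait event (class and event id). */
static inline uint32_t
sketch_wait_event(uint64_t info)
{
    return (uint32_t) (info >> SKETCH_WAIT_ARG_BITS);
}

/* Extract the 32-bit argument, e.g. a Buffer# or an SLRU type. */
static inline uint32_t
sketch_wait_event_arg(uint64_t info)
{
    return (uint32_t) (info & UINT32_MAX);
}
```

Under these assumptions, reporting a wait on shared buffer 225 would store sketch_pack_wait_event(event, 225), and ending the wait would store 0 for the whole 64-bit value in one atomic write, which is why a single uint64 is enough and no second field is needed.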
What's left:
- Earlier Heikki raised the question "Wait events can be defined in extensions;
how does an extension plug into this facility?" - that's still unanswered.
I think they could just OR in a 32-bit value themselves, but maybe we could
just provide a way to plug into pg_get_wait_events().waiteventarg_description?
- docs
- of course it could be extended with more reporting if someone has further
ideas
-J.
| Attachment | Content-Type | Size |
|---|---|---|
| v4-0006-wait_event_arg-expose-buffer-for-Buffer-type-wait.patch | text/x-patch | 3.7 KB |
| v4-0002-wait_event_arg-expose-slowest-standby-PID-for-IPC.patch | text/x-patch | 2.8 KB |
| v4-0004-Expose-meaning-of-new-per-wait-wait_event_arg-thr.patch | text/x-patch | 11.3 KB |
| v4-0005-wait_event_arg-report-number-of-spinlock-delays-f.patch | text/x-patch | 2.1 KB |
| v4-0003-wait_event_arg-implement-SLRU-type-reporting-for-.patch | text/x-patch | 11.1 KB |
| v4-0001-Convert-wait_event_info-to-64-t-bits-expose-lower.patch | text/x-patch | 78.1 KB |