XID-wraparound hazards in LISTEN/NOTIFY

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Martijn van Oosterhout <kleptog(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Mark Dilger <hornschnorter(at)gmail(dot)com>
Subject: XID-wraparound hazards in LISTEN/NOTIFY
Date: 2019-11-23 16:34:45
Message-ID: 27360.1574526885@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

In connection with a different issue, I wrote:

> * The point of calling asyncQueueReadAllNotifications in
> Exec_ListenPreCommit is to advance over already-committed queue entries
> before we start sending any events to the client. Without this, a
> newly-listening client could be sent some very stale events. (Note
> that 51004c717 changed this from "somewhat stale" to "very stale".

It suddenly strikes me to worry that we have an XID wraparound hazard
for entries in the notify queue. The odds of seeing a live problem
with that before 51004c717 were pretty minimal, but now that we don't
aggressively advance the queue tail, I think it's a very real risk for
low-notify-traffic installations. In the worst case, imagine

* Somebody sends one NOTIFY, maybe just as a test.

* Nothing happens for a couple of weeks, during which the XID counter
advances by 2 billion or so.

* Newly-listening sessions will now think that that old event is
"in the future", hence fail to advance over it, resulting in denial
of service for new notify traffic. There is no recourse short of
a server restart or waiting another couple weeks for wraparound.

I thought about fixing this by tracking the queue's oldest XID in
the shared queue info, and forcing a tail advance when that got
too old --- but if nobody actively uses any listen or notify
features for awhile, no such code is going to execute, so the
above scenario could happen anyway.

The only bulletproof fix I can think of offhand is to widen the
queue entries to 64-bit XIDs, which is a tad annoying from a
space consumption standpoint. Possibly we could compromise by
storing the high-order bits just once per SLRU page (and then
forcing an advance to a new page when those bits change).

A somewhat less bulletproof fix is to detect far-in-the-future queue
entries and discard them. That would prevent the freezeup scenario,
but there'd be a residual hazard of transmitting ancient
notifications to clients (if a queue entry survived for 4G
transactions not just 2G). But maybe that's OK, given the rather
tiny probabilities involved, and the low guarantees around notify
reliability in general. It'd be a much more back-patchable answer
than a queue format change, too.

Thoughts? I'm not really planning to work on this myself, but
somebody oughta.

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2019-11-23 16:35:09 Re: [HACKERS] WAL logging problem in 9.4.3?
Previous Message Alvaro Herrera 2019-11-23 16:14:06 Re: errbacktrace