Re: snapshot too old issues, first around wraparound and then more.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: snapshot too old issues, first around wraparound and then more.
Date: 2020-04-02 00:54:06
Message-ID: 20200402005406.xtn6wgqu7veqie4p@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-04-01 16:59:51 -0700, Andres Freund wrote:
> The primary issue here is that there is no TestForOldSnapshot() in
> heap_hot_search_buffer(). Therefore index fetches will never even try to
> detect that tuples it needs actually have already been pruned away.

bitmap heap scan doesn't have the necessary checks either. In the
non-lossy case it uses heap_hot_search_buffer, for the lossy case it has
an open coded access without the check (that's bitgetpage() before v12,
and heapam_scan_bitmap_next_block() after that).

Nor do sample scans, but that was "at least" introduced later.

As far as I can tell there's not sufficient in-tree explanation of when
code needs to test for an old snapshot. There's just the following
comment above TestForOldSnapshot():
* Check whether the given snapshot is too old to have safely read the given
* page from the given table. If so, throw a "snapshot too old" error.
*
* This test generally needs to be performed after every BufferGetPage() call
* that is executed as part of a scan. It is not needed for calls made for
* modifying the page (for example, to position to the right place to insert a
* new index tuple or for vacuuming). It may also be omitted where calls to
* lower-level functions will have already performed the test.

But I don't find "as part of a scan" very informative. I mean, it
was explicitly not called from with the executor back then (for a while
the check was embedded in BufferGetPage()):

static void
bitgetpage(HeapScanDesc scan, TBMIterateResult *tbmres)
...
Page dp = BufferGetPage(buffer, NULL, NULL, BGP_NO_SNAPSHOT_TEST);

I am more than a bit dumbfounded here.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2020-04-02 01:04:15 Re: snapshot too old issues, first around wraparound and then more.
Previous Message Tomas Vondra 2020-04-02 00:41:36 Re: SLRU statistics