Re: snapshot too old issues, first around wraparound and then more.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: snapshot too old issues, first around wraparound and then more.
Date: 2020-04-02 01:46:46
Message-ID: 20200402014646.dr5brit4cubph3k5@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-04-01 17:54:06 -0700, Andres Freund wrote:
> * Check whether the given snapshot is too old to have safely read the given
> * page from the given table. If so, throw a "snapshot too old" error.
> *
> * This test generally needs to be performed after every BufferGetPage() call
> * that is executed as part of a scan. It is not needed for calls made for
> * modifying the page (for example, to position to the right place to insert a
> * new index tuple or for vacuuming). It may also be omitted where calls to
> * lower-level functions will have already performed the test.

To me this sounds like we'd not need to check for an old snapshot in
heap_delete/update/lock_tuple. And they were explictly not testing for
old snapshots. But I don't understand why that'd be correct?

In a lot of UPDATE/DELETE queries there's no danger that the target
tuple will be pruned away, because the underlying scan node will hold a
pin. But I don't think that's guaranteed. E.g. if a tidscan is below the
ModifyTable node, it will not hold a pin by the time we heap_update,
because there's no scan holding a pin, and the slot will have been
materialized before updating.

There are number of other ways, I think.

So it's possible to get to heap_update/delete (and probably lock_tuple
as well) with a tid that's already been pruned away. Neither contains a
non-assert check ensuring the tid still is normal.

With assertions we'd fail with an assertion in PageGetItem(). But
without it looks like we'll interpret the page header as a tuple. Which
can't be good.

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Shinoda, Noriyoshi (PN Japan A&PS Delivery) 2020-04-02 02:04:10 RE: SLRU statistics
Previous Message movead.li@highgo.ca 2020-04-02 01:41:48 Re: recovery_target_action=pause with confusing hint