Re: Toast issues with OldestXmin going backwards

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Toast issues with OldestXmin going backwards
Date: 2018-04-20 19:56:22
Message-ID: 87604lekqo.fsf@news-spur.riddles.org.uk
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>>>>> "Amit" == Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:

Amit> I haven't tried to reproduce it, but I can see the possibility of
Amit> the problem described by you. What should we do next? I could see
Amit> few possibilities: (a) Vacuum for main and toast table should
Amit> always use same OldestXmin,

I don't see how this could be arranged without either disabling the
ability to vacuum the toast table independently, or storing the xmin
somewhere.

Amit> (b) Before removing the row from toast table, we should ensure
Amit> that row in the main table is removed,

We can't find the main table row version(s) without scanning the whole
main table, so this amounts (again) to disabling vacuum on the toast
table only.

Amit> (c) Change the logic during rewrite such that it can detect this
Amit> situation and skip copying the tuple in the main table,

This seems more promising, but the problem is how to detect the error
safely (since currently, it's just ereport()ed from deep inside the
toast code). How about:

1) add a toast_datum_exists() call or similar that returns false if the
(first block of) the specified external toast item is missing

2) when copying a RECENTLY_DEAD tuple, check all its external column
values using this function beforehand, and if any are missing, treat the
tuple as DEAD instead.

Amit> on a quick look, this seems tricky, because the toast table TID
Amit> might have been reused by that time,

Toast pointers don't point to TIDs fortunately, they point to OIDs. The
OID could have been reused (if there's been an OID wraparound since the
toast item was created), but that should be harmless: it means that we'd
incorrectly copy the new value with the old tuple, but the old tuple is
never going to be visible to anybody ever again so nothing will see that.

Amit> or (d) Ensure that GetOldestXmin doesn't move backwards or write
Amit> a new API similar to it which doesn't allow OldestXmin to move
Amit> backwards and use that for the purpose of vacuum.

This would presumably require storing a reference OldestXmin somewhere
(in shared memory? but there'd have to be a global one and one per db,
and I don't think we have any shared memory structures at present that
are per-db). We'd even have to preserve an oldestXmin for databases with
no currently active connections, though we wouldn't have to preserve it
across restarts.

Amit> Any better ideas?

It turns out this issue has come up before (5 years ago!):

https://www.postgresql.org/message-id/20362.1359747327%40sss.pgh.pa.us

--
Andrew (irc:RhodiumToad)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Gierth 2018-04-20 20:03:17 Re: Add read-only param to set_config(...) / SET that effects (at least) customized runtime options
Previous Message Alexander Kuzmenkov 2018-04-20 18:51:49 Re: Reopen logfile on SIGHUP