Re: Toast issues with OldestXmin going backwards

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Toast issues with OldestXmin going backwards
Date: 2018-04-22 12:00:11
Message-ID: CAA4eK1+qpcPy+xVfYtf8GNR6r1VuLc+=q0GVCV3cecwkMqVobQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Apr 21, 2018 at 1:26 AM, Andrew Gierth
<andrew(at)tao11(dot)riddles(dot)org(dot)uk> wrote:
>>>>>> "Amit" == Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> writes:
>
> Amit> I haven't tried to reproduce it, but I can see the possibility of
> Amit> the problem described by you. What should we do next? I could see
> Amit> few possibilities: (a) Vacuum for main and toast table should
> Amit> always use same OldestXmin,
>
> I don't see how this could be arranged without either disabling the
> ability to vacuum the toast table independently, or storing the xmin
> somewhere.
>

I think we can use the same xmin for both main heap and toast by
computing it before scanning the main heap (lazy_vacuum_rel) and then
pass it down. I have tried it and came up with the attached patch.

> Amit> (b) Before removing the row from toast table, we should ensure
> Amit> that row in the main table is removed,
>
> We can't find the main table row version(s) without scanning the whole
> main table, so this amounts (again) to disabling vacuum on the toast
> table only.
>
> Amit> (c) Change the logic during rewrite such that it can detect this
> Amit> situation and skip copying the tuple in the main table,
>
> This seems more promising, but the problem is how to detect the error
> safely (since currently, it's just ereport()ed from deep inside the
> toast code). How about:
>
> 1) add a toast_datum_exists() call or similar that returns false if the
> (first block of) the specified external toast item is missing
>
> 2) when copying a RECENTLY_DEAD tuple, check all its external column
> values using this function beforehand, and if any are missing, treat the
> tuple as DEAD instead.
>
> Amit> on a quick look, this seems tricky, because the toast table TID
> Amit> might have been reused by that time,
>
> Toast pointers don't point to TIDs fortunately, they point to OIDs. The
> OID could have been reused (if there's been an OID wraparound since the
> toast item was created), but that should be harmless: it means that we'd
> incorrectly copy the new value with the old tuple, but the old tuple is
> never going to be visible to anybody ever again so nothing will see that.
>

Yeah, that's right, but it gives some uneasy feeling that we are
attaching the wrong toast value. If you think there is some basic
flaw in Approach-1 (as in attached patch) or it requires some major
surgery then we can look further into this.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
use_same_oldestxmin_mainheap_toast_v1.patch application/octet-stream 5.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2018-04-22 12:04:10 Re: BGWORKER_BYPASS_ALLOWCONN used nowhere (infra part of on-line checksum switcher)
Previous Message Michael Paquier 2018-04-22 11:11:00 BGWORKER_BYPASS_ALLOWCONN used nowhere (infra part of on-line checksum switcher)