Re: Truncation failure in autovacuum results in data corruption (duplicate keys)

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: maumau307(at)gmail(dot)com, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Truncation failure in autovacuum results in data corruption (duplicate keys)
Date: 2018-08-20 15:04:31
Message-ID: CAPpHfduiVRuAg-xfNTJXpCz0WYX+RV1hd8vM=X6=VMj2-Q8=qg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 18, 2018 at 10:04 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> [ re-reads thread... ] The extra assumption you need in order to have
> trouble is that the blocks in question are dirty in shared buffers and
> have never been written to disk since their rows were deleted. Then
> the situation is that the page image on disk shows the rows as live,
> while the up-to-date page image in memory correctly shows them as dead.
> Relation truncation throws away the page image in memory without ever
> writing it to disk. Then, if the subsequent file truncate step fails,
> we have a problem, because anyone who goes looking for that page will
> fetch it afresh from disk and see the tuples as live.
>
> There are WAL entries recording the row deletions, but that doesn't
> help unless we crash and replay the WAL.
>
> It's hard to see a way around this that isn't fairly catastrophic for
> performance :-(. But in any case it's wrapped up in order-of-operations
> issues. I've long since forgotten the details, but I seem to have thought
> that there were additional order-of-operations hazards besides this one.

Just to clarify: do you mean that zeroing the to-be-truncated blocks
would be catastrophic for performance, or something else?
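
To make the sequence in the quoted text concrete, here is a toy model of
it. This is only a sketch, not PostgreSQL code: the Relation class, its
methods, and the zero_first flag are hypothetical names made up for
illustration. It shows a dirty buffer being discarded, the file truncate
failing, and a later read seeing the stale "live" image from disk. The
zero_first option is one reading of the zeroing idea asked about above,
and it says nothing about what that would cost.

```python
# Toy model of the hazard described above. Not PostgreSQL code: the
# Relation class and its methods are hypothetical.

class Relation:
    def __init__(self, nblocks):
        # On-disk page images: every block starts with live rows.
        self.disk = {blk: "live" for blk in range(nblocks)}
        # Shared buffers: block number -> (page image, dirty flag).
        self.buffers = {}

    def delete_rows(self, blk):
        # The deletion only dirties the in-memory copy; disk is untouched.
        self.buffers[blk] = ("dead", True)

    def read(self, blk):
        # A buffer hit returns the up-to-date image; a miss reads from disk.
        if blk in self.buffers:
            return self.buffers[blk][0]
        return self.disk[blk]

    def truncate(self, new_nblocks, file_truncate_fails=False, zero_first=False):
        doomed = [blk for blk in self.disk if blk >= new_nblocks]
        if zero_first:
            # Assumed mitigation: overwrite the to-be-truncated blocks
            # on disk before discarding their buffers.
            for blk in doomed:
                self.disk[blk] = "empty"
        # Step 1: discard the buffers without writing them out.
        for blk in doomed:
            self.buffers.pop(blk, None)
        # Step 2: shorten the file; this is the step that can fail.
        if file_truncate_fails:
            return False
        for blk in doomed:
            del self.disk[blk]
        return True


def demo(zero_first):
    rel = Relation(nblocks=2)
    rel.delete_rows(1)                # block 1 is dead in memory only
    assert rel.read(1) == "dead"      # served from the dirty buffer
    rel.truncate(1, file_truncate_fails=True, zero_first=zero_first)
    return rel.read(1)                # buffer is gone, so this hits disk


print(demo(zero_first=False))  # live
print(demo(zero_first=True))   # empty
```

In the first call the deleted rows come back as live, which is the
duplicate-key scenario; in the second the reader finds an empty page
instead.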

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
