Re: reloption to prevent VACUUM from truncating empty pages at the end of relation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "'Michael Paquier'" <michael(at)paquier(dot)xyz>, "'Robert Haas'" <robertmhaas(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, "Jamison, Kirk" <k(dot)jamison(at)jp(dot)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: reloption to prevent VACUUM from truncating empty pages at the end of relation
Date: 2019-02-28 22:17:43
Message-ID: 1261.1551392263@sss.pgh.pa.us

Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> writes:
> On 2019-Feb-28, Tom Lane wrote:
>> I wasn't really working on that for v12 --- I figured it was way
>> too late in the cycle to be starting on such a significant change.

> Oh, well, it certainly seems far too late *now*. However, what about
> the idea in
> https://postgr.es/m/1255.1544562482@sss.pgh.pa.us
> namely that we write out the buffers involved? That sounds like it
> might be backpatchable, and thus it's not too late for it.

I think that what we had in mind at that point was that allowing forced
writes of empty-but-dirty pages would provide a back-patchable solution
to the problem of ftruncate() failure leaving corrupt state on-disk.
That would not, by itself, remove the need for AccessExclusiveLock, so it
doesn't seem like it would eliminate people's desire for the kind of knob
being discussed here.
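
To make the ordering concrete, here is a toy sketch of the forced-write idea, not PostgreSQL code. It assumes a flat file of fixed-size pages, treats an all-zeros page as "empty", and uses hypothetical names (TOY_BLCKSZ, toy_flush_empty_tail, toy_truncate_tail) invented for illustration. The point is only that the empty page images reach disk before ftruncate() is attempted, so a failed ftruncate() would leave valid empty pages behind rather than stale contents.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define TOY_BLCKSZ 8192         /* hypothetical page size for this sketch */

/*
 * Overwrite blocks [first, last) with an empty (all-zeros) page image and
 * fsync the file.  Returns 0 on success, -1 on failure.
 */
int
toy_flush_empty_tail(int fd, unsigned first, unsigned last)
{
    char        empty[TOY_BLCKSZ];

    memset(empty, 0, sizeof(empty));
    for (unsigned blkno = first; blkno < last; blkno++)
    {
        off_t       off = (off_t) blkno * TOY_BLCKSZ;

        if (pwrite(fd, empty, TOY_BLCKSZ, off) != (ssize_t) TOY_BLCKSZ)
            return -1;
    }
    return fsync(fd);
}

/*
 * Shrink the file from old_nblocks to new_nblocks pages.  The tail is
 * written out as empty pages first; only then is ftruncate() attempted.
 * If ftruncate() fails, the caller gets -1 but the on-disk tail is still
 * a run of valid empty pages.
 */
int
toy_truncate_tail(int fd, unsigned new_nblocks, unsigned old_nblocks)
{
    assert(new_nblocks <= old_nblocks);
    if (toy_flush_empty_tail(fd, new_nblocks, old_nblocks) != 0)
        return -1;
    return ftruncate(fd, (off_t) new_nblocks * TOY_BLCKSZ);
}
```

Note that nothing in this sketch says anything about locking; it only addresses what is on disk if the truncate step fails.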

Thinking about it, the need for AEL is mostly independent of the data
corruption problem; rather, it's a hack to avoid needing to think about
concurrent-truncation scenarios in table readers. We could fairly
easily reduce the lock level to something less than AEL if we just
taught seqscans, indexscans, etc that trying to read a page beyond
EOF is not an error. (Reducing the lock level to the point where
we could allow concurrent *writers* is a much harder problem, I think.
But to ameliorate the issues for standbys, we just need to allow
concurrent readers.) And we'd have to do something about readers
possibly loading doomed pages back into shmem before the truncation
happens; maybe that can be fixed just by truncating first and flushing
buffers second?
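
As a rough illustration of "read beyond EOF is not an error", here is a toy reader over a flat file of fixed-size pages. This is a sketch under stated assumptions, not how the PostgreSQL scan or storage-manager code is structured; TOY_BLCKSZ, ToyReadResult, toy_read_block and toy_seqscan are all hypothetical names. The scan starts with a possibly stale idea of the relation length and simply stops when it runs off the end, which is what a reader would see if a truncation happened underneath it.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

#define TOY_BLCKSZ 8192         /* hypothetical page size for this sketch */

typedef enum
{
    TOY_READ_OK,                /* a full page was read */
    TOY_READ_PAST_EOF,          /* block lies entirely beyond end of file */
    TOY_READ_ERROR              /* I/O error or a partial page */
} ToyReadResult;

/* Read block blkno from fd into buf, which must hold TOY_BLCKSZ bytes. */
ToyReadResult
toy_read_block(int fd, unsigned blkno, char *buf)
{
    off_t       offset = (off_t) blkno * TOY_BLCKSZ;
    size_t      got = 0;

    assert(buf != NULL);
    while (got < TOY_BLCKSZ)
    {
        ssize_t     n = pread(fd, buf + got, TOY_BLCKSZ - got,
                              offset + (off_t) got);

        if (n < 0)
            return TOY_READ_ERROR;
        if (n == 0)
            break;              /* hit end of file */
        got += (size_t) n;
    }
    if (got == 0)
        return TOY_READ_PAST_EOF;
    if (got < TOY_BLCKSZ)
        return TOY_READ_ERROR;  /* a torn page is still treated as an error */
    return TOY_READ_OK;
}

/*
 * Scan blocks 0 .. nblocks_hint-1, where nblocks_hint may be stale because
 * the file was truncated after the length was looked up.  Running past EOF
 * ends the scan quietly.  Returns the number of pages visited, or -1 on a
 * real error.
 */
long
toy_seqscan(int fd, unsigned nblocks_hint)
{
    char        buf[TOY_BLCKSZ];
    long        visited = 0;

    for (unsigned blkno = 0; blkno < nblocks_hint; blkno++)
    {
        ToyReadResult r = toy_read_block(fd, blkno, buf);

        if (r == TOY_READ_PAST_EOF)
            break;              /* concurrent truncation: not an error */
        if (r == TOY_READ_ERROR)
            return -1;
        visited++;
    }
    return visited;
}
```

The sketch also shows where error detection would be given up: a caller that asks for a block it has good reason to believe exists gets TOY_READ_PAST_EOF rather than a hard failure, and it is up to the caller to decide whether that is plausible.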

I think the $64 question is whether we're giving up any meaningful degree
of error detection if we allow read-beyond-EOF to not be an error. If we
conclude that we're not, maybe it wouldn't be a very big patch?

regards, tom lane
