Re: [HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Ivan Kartyshov <i(dot)kartyshov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept
Date: 2018-08-17 21:20:53
Message-ID: 30715.1534540853@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> writes:
> On Fri, Aug 17, 2018 at 9:55 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Another point is that the truncation code attempts to remove all
>> to-be-truncated-away pages from the shared buffer arena, but that only
>> works if nobody else is loading such pages into shared buffers
>> concurrently. In the presence of concurrent scans, we might be left
>> with valid-looking buffers for pages that have been truncated away
>> on-disk. That could cause all sorts of fun later. Yeah, the buffers
>> should contain only dead tuples ... but, for example, they might not
>> be hinted dead. If somebody sets one of those hint bits and then
>> writes the buffer back out to disk, you've got real problems.

> Thank you for the explanation. I see that injecting past OEF pages
> into shared buffers doesn't look good. I start thinking about letting
> caller of ReadBuffer() (or its variation) handle past OEF situation.

That'd still have the same race condition, though: between the time
we start to drop the doomed pages from shared buffers, and the time
we actually perform ftruncate, concurrent scans could re-load such
pages into shared buffers.

Could it work to ftruncate first and flush shared buffers after?
Probably not, I think the write-back-dirty-hint-bits scenario
breaks that one.

If this were easy, we'd have fixed it years ago :-(. It'd sure
be awfully nice not to need AEL during autovacuum, even transiently;
but I'm not sure how we get there without adding an unpleasant amount
of substitute locking in table scans.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jonathan S. Katz 2018-08-17 21:21:20 Fix for REFRESH MATERIALIZED VIEW ownership error message
Previous Message Jonathan S. Katz 2018-08-17 21:12:42 Re: docs: note ownership requirement for refreshing materialized views