Re: [HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Ivan Kartyshov <i(dot)kartyshov(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept
Date: 2018-08-24 15:53:14
Message-ID: CAPpHfdu03sZ21U76vPncN3w_+N9jP4p5Svj5fb6FF9N58GYWYw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Aug 21, 2018 at 4:10 PM Alexander Korotkov
<a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> After reading [1] and [2] I gather that there are at least 3 different
> issues with heap truncation:
> 1) Data corruption on file truncation error (explained in [1]).
> 2) Expensive scanning of the whole shared buffers before file truncation.
> 3) Cancellation of read-only queries on standby even if
> hot_standby_feedback is on, caused by replication of
> AccessExclusiveLock.
>
> It seems that fixing any of those issues requires a redesign of heap
> truncation. So, ideally, a redesign of heap truncation should fix all
> of the issues above, or at least it should be understood how the rest
> of the issues can be fixed later using the new design.
>
> I would like to share some sketchy thoughts of mine about a new heap
> truncation design. Let's imagine we introduce a dirty_barrier buffer
> flag, which prevents a dirty buffer from being written out (and
> correspondingly from being evicted). The truncation algorithm could
> then look like this:
>
> 1) Acquire ExclusiveLock on relation.
> 2) Calculate the truncation point using count_nondeletable_pages(),
> while simultaneously placing the dirty_barrier flag on dirty buffers
> and saving their numbers to an array. Assuming no writes are being
> performed concurrently, no to-be-truncated-away pages should be
> written out from this point on.
> 3) Truncate data files.
> 4) Iterate over the past-truncation-point buffers and clear the dirty
> and dirty_barrier flags from them (using the numbers we saved to the
> array in step #2).
> 5) Release relation lock.
> *) If an exception happens after step #2, iterate over the
> past-truncation-point buffers and clear the dirty_barrier flag from
> them (using the numbers we saved to the array in step #2).
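
To make the control flow a bit more concrete, here is a rough sketch of
how this could look inside vacuum (onerel is the relation being
vacuumed; BM_DIRTY_BARRIER, count_nondeletable_pages_setting_barrier
and the clear_buffer_* helpers are made-up names, nothing like this
exists yet):

    BlockNumber new_rel_pages;
    int        *barrier_bufs;   /* ids of buffers flagged in step #2 */
    int         n_barrier_bufs;
    int         i;

    /* 1) ExclusiveLock, not AccessExclusiveLock */
    LockRelation(onerel, ExclusiveLock);

    /*
     * 2) Find the truncation point as count_nondeletable_pages() does
     * today, but additionally set the hypothetical BM_DIRTY_BARRIER
     * flag on every dirty past-truncation-point buffer and remember
     * those buffers in an array.
     */
    new_rel_pages = count_nondeletable_pages_setting_barrier(onerel,
                                                             &barrier_bufs,
                                                             &n_barrier_bufs);

    PG_TRY();
    {
        /* 3) Truncate the data files */
        RelationTruncate(onerel, new_rel_pages);

        /* 4) Success: drop both the dirty and the barrier flags */
        for (i = 0; i < n_barrier_bufs; i++)
            clear_buffer_dirty_and_barrier(barrier_bufs[i]);
    }
    PG_CATCH();
    {
        /* *) Failure: keep the buffers dirty, drop only the barrier */
        for (i = 0; i < n_barrier_bufs; i++)
            clear_buffer_barrier(barrier_bufs[i]);
        PG_RE_THROW();
    }
    PG_END_TRY();

    /* 5) Release the relation lock */
    UnlockRelation(onerel, ExclusiveLock);
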
>
> After heap truncation using this algorithm, shared buffers may contain
> past-EOF buffers. But those buffers are empty (no used items) and
> clean. So, read-only queries shouldn't dirty those buffers with hint
> bits, because there are no used items. Normally, these buffers will
> just be evicted from the shared buffer arena. If a relation extension
> happens shortly after the heap truncation, then some of those buffers
> could be found again after the extension. I think this situation can
> be handled. For instance, we can teach vacuum to mark a page as new
> once all of its tuples are gone.
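
Purely for illustration, and ignoring WAL-logging and checksum details,
vacuum's per-page processing might do something along these lines once
a page has been completely emptied (no such code exists today):

    /*
     * Hypothetical sketch: once lazy vacuum has removed every tuple
     * from a page (and no line pointers remain), reset the page to the
     * all-zeroes "new" state, so that if the block reappears past a
     * truncation point after a later relation extension it looks like
     * a freshly extended page rather than stale data. The buffer is
     * assumed to be exclusively locked here, as during vacuum's page
     * pass.
     */
    if (!PageIsNew(page) && PageGetMaxOffsetNumber(page) == 0)
    {
        MemSet(page, 0, BLCKSZ);    /* PageIsNew() becomes true again */
        MarkBufferDirty(buf);
    }
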
>
> We're taking only an ExclusiveLock here. And assuming we teach our
> scans to treat a page-past-EOF situation as no-visible-tuples-found,
> read-only queries will run concurrently with the heap truncation.
> Also, we don't have to scan the whole of shared buffers: only the
> past-truncation-point buffers are scanned in step #2, and later the
> flags are cleared only from the past-truncation-point dirty buffers.
> Data corruption on a truncation error also shouldn't happen, because
> we don't discard any dirty buffers before ensuring that the data
> files were successfully truncated.
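
On the scan side, such a check could live somewhere like heapgetpage();
the exact placement and the names below are just for illustration:

    /*
     * Hypothetical sketch: if the scan asks for a block that now lies
     * past the end of the file (the relation was truncated under an
     * ExclusiveLock after the scan noted its size), report "no visible
     * tuples" instead of reading past EOF.
     */
    if (blkno >= RelationGetNumberOfBlocks(scan->rs_rd))
    {
        scan->rs_ntuples = 0;   /* page past EOF => no visible tuples */
        return;
    }
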
>
> The problem I see with this approach so far is that placing too many
> dirty_barrier flags can affect concurrent activity. In order to cope
> with that, we may, for instance, truncate the relation in multiple
> iterations when we find too many past-truncation-point dirty buffers.
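
Such a loop could bound the number of flagged buffers per round,
roughly like this (again, all names here are made up):

    /*
     * Hypothetical sketch: truncate in chunks, flagging at most
     * MAX_BARRIER_BUFFERS dirty buffers per iteration.
     */
    while (cur_rel_pages > nonempty_pages)
    {
        BlockNumber chunk_point;

        /* scan backwards, stop once the per-round flag budget is spent */
        chunk_point = find_truncation_point_with_budget(onerel,
                                                        nonempty_pages,
                                                        MAX_BARRIER_BUFFERS);
        RelationTruncate(onerel, chunk_point);
        clear_dirty_and_barrier_flags(onerel, chunk_point, cur_rel_pages);
        cur_rel_pages = chunk_point;
    }
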
>
> Any thoughts?

Given that I've had no feedback on this idea yet, I'll try to implement
a PoC patch for it. It doesn't look difficult, and we'll see how it
works.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
