Re: WIP: long transactions on hot standby feedback replica / proof of concept

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Ivan Kartyshov <i(dot)kartyshov(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: long transactions on hot standby feedback replica / proof of concept
Date: 2017-11-03 22:04:22
Message-ID: CAPpHfdu9cDv_2Yw87=5U25P+1k8Mv=K_o78tTcT70km55okK5g@mail.gmail.com
Lists: pgsql-hackers

On Wed, Nov 1, 2017 at 5:55 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
wrote:

> On Tue, Oct 31, 2017 at 6:17 PM, Alexander Korotkov
> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> > On Tue, Oct 31, 2017 at 5:16 AM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
> > wrote:
> >>
> >> On Mon, Oct 30, 2017 at 10:16 PM, Robert Haas <robertmhaas(at)gmail(dot)com>
> >> wrote:
> >> > On Tue, Oct 24, 2017 at 1:26 PM, Ivan Kartyshov
> >> > <i(dot)kartyshov(at)postgrespro(dot)ru> wrote:
> >> >> Hello. I made some bugfixes and rewrote the patch.
> >> >
> >> > I don't think it's a good idea to deliberately leave the state of the
> >> > standby different from the state of the master on the theory that it
> >> > won't matter. I feel like that's something that's likely to come back
> >> > to bite us.
> >>
> >> I agree with Robert. What happens if we intentionally don't apply the
> >> truncation WAL and then switch over? If we insert a tuple on the new
> >> master server into a block that has been truncated on the old master,
> >> will WAL apply on the new standby fail? I guess there are such
> >> corner cases causing failures of WAL replay after switch-over.
> >
> >
> > Yes, that looks dangerous. One approach to cope with that could be
> > teaching the heap redo function to handle such situations. But I expect
> > this approach to be criticized for bad design. And I understand the
> > fairness of this criticism.
> >
> > However, from the user's point of view, the current behavior of
> > hot_standby_feedback is just broken, because it both increases bloat and
> > doesn't guarantee that a read-only query on standby won't be cancelled
> > because of vacuum. Therefore, we should be looking for a solution: if one
> > approach isn't good enough, then we should look for another approach.
> >
> > I can propose the following alternative approach: teach read-only queries
> > on hot standby to tolerate concurrent relation truncation. Then, when a
> > non-existent heap page is accessed on hot standby, we can know that it
> > was deleted by concurrent truncation and should be assumed to be empty.
> > Any thoughts?
> >
>
> Do you also mean that applying WAL for AccessExclusiveLock is always
> skipped on standby servers to allow scans to access the relation?

Definitely not every AccessExclusiveLock WAL record should be skipped, only
those emitted during heap truncation. There are other cases where
AccessExclusiveLock WAL records are emitted, for instance during DDL
operations. But I'd like to focus on AccessExclusiveLock WAL records
caused by VACUUM for now. It's fairly understandable to users that DDL
might cancel a read-only query on standby. So if you're running a long
report query, you should hold off on your DDL. But VACUUM is a different
story. It runs automatically when you run normal DML queries.

AccessExclusiveLock WAL records from VACUUM could either not be emitted at
all, or be somehow distinguished and skipped on standby. I haven't thought
it through to that level of detail yet.
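
To make the "distinguished and skipped" option a bit more concrete, here is
a minimal sketch of the standby-side decision. It is only a toy model under
stated assumptions: the LockOrigin tag, the ToyLockRecord struct and both
functions are hypothetical names made up for illustration. None of this is
PostgreSQL's actual WAL record layout or redo code, and nothing like the
origin tag exists in the current record format.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tag saying why an AccessExclusiveLock record was logged. */
typedef enum {
    LOCK_ORIGIN_DDL,
    LOCK_ORIGIN_VACUUM_TRUNCATE
} LockOrigin;

/* Toy stand-in for a lock WAL record (assumption, not the real layout). */
typedef struct {
    unsigned int relation_id;
    LockOrigin   origin;
} ToyLockRecord;

/* Standby-side rule: skip the lock only when VACUUM truncation logged it. */
static bool
skip_lock_on_standby(const ToyLockRecord *rec)
{
    return rec->origin == LOCK_ORIGIN_VACUUM_TRUNCATE;
}

/* Count the records in a batch that would still take the lock on standby. */
static size_t
count_locks_applied(const ToyLockRecord *recs, size_t n)
{
    size_t applied = 0;

    for (size_t i = 0; i < n; i++)
        if (!skip_lock_on_standby(&recs[i]))
            applied++;
    return applied;
}
```

With such a rule, locks coming from DDL would still be replayed and could
still cancel queries, while locks coming from VACUUM truncation would not.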

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
