From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Craig Ringer <craig(at)2ndquadrant(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Timeline following for logical slots |
Date: | 2016-04-04 10:01:16 |
Message-ID: | 20160404100116.GB25969@awork2.anarazel.de |
Lists: | pgsql-hackers |
On 2016-04-04 17:50:02 +0800, Craig Ringer wrote:
> To rephrase per my understanding: The client only specifies the point it
> wants to start seeing decoded commits. Decoding starts from the slot's
> restart_lsn, and that's the point from which the accumulation of reorder
> buffer contents begins, the snapshot building process begins, and where
> accumulation of relcache invalidation information begins. At restart_lsn no
> xact that is to be emitted to the client may yet be in progress. Decoding,
s/yet/already/
> whether or not the xacts will be fed to the output plugin callbacks,
> requires access to the system catalogs. Therefore catalog_xmin reported by
> the slot must be >= the real effective catalog_xmin of the heap and valid
> at the restart_lsn, not just the confirmed flush point or the point the
> client specifies to resume fetching changes from.
Hm. Maybe I'm misunderstanding you here, but doesn't it have to be <=?
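The direction of that comparison can be modeled in a few lines. This is an illustrative sketch with invented names, not PostgreSQL code: it only shows why the slot's catalog_xmin acts as a removal horizon, i.e. vacuum may remove a dead catalog row version only once every slot's catalog_xmin has passed it.

```python
def catalog_row_removable(row_xmax: int, slot_catalog_xmins: list[int]) -> bool:
    """A dead catalog row version is removable only if no slot might
    still decode a transaction that needs it. (Sketch; names invented.)"""
    horizon = min(slot_catalog_xmins)  # oldest catalog_xmin of any slot
    return row_xmax < horizon

# A slot reporting catalog_xmin = 100 keeps catalog row versions deleted
# by xid >= 100; a row deleted by xid 99 may already be gone.
assert catalog_row_removable(99, [100]) is True
assert catalog_row_removable(100, [100]) is False
```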
> On the original copy of the slot on the pre-failover master the restart_lsn
> would've been further ahead, as would the catalog_xmin. So catalog rows
> have been purged.
+may
> So it's necessary to ensure that the slot's restart_lsn and catalog_xmin
> are advanced in a timely, consistent manner on the replica's copy of the
> slot at a point where no vacuum changes to the catalog that could remove
> needed tuples have been replayed.
Right.
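The ordering constraint on the replica can be stated as a one-line predicate. Again a hedged sketch with invented names, assuming a vacuum record carries a cutoff xid below which dead catalog rows are removed: replaying such a record is only safe once the replica's copy of the slot no longer needs anything older than that cutoff.

```python
def replay_safe(slot_catalog_xmin: int, vacuum_cutoff_xid: int) -> bool:
    """Replaying a catalog-vacuum record with this cutoff is only safe
    if the replica's slot copy protects nothing older than it.
    (Sketch; names invented.)"""
    return slot_catalog_xmin >= vacuum_cutoff_xid

# If the replica's slot copy lags (catalog_xmin = 90) while the master
# already vacuumed catalog rows up to xid 100, replaying that vacuum
# record removes tuples the replica's slot still needs.
assert replay_safe(90, 100) is False
assert replay_safe(100, 100) is True
```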
> The only way I can think of to do that really reliably right now, without
> full failover slots, is to use the newly committed pluggable WAL mechanism
> and add a hook to SaveSlotToPath() so slot info can be captured, injected
> in WAL, and replayed on the replica.
I personally think the primary answer is to use separate slots on
different machines. Failover slots can be an extension to that at some
point, but I think they're a secondary goal.
> It'd also be necessary to move
> CheckPointReplicationSlots() out of CheckPointGuts() to the start of a
> checkpoint/restartpoint when WAL writing is still permitted, like the
> failover slots patch does.
Ugh. That makes me rather wary.
> Basically, failover slots as a plugin using a hook, without the
> additions to base backup commands and the backup label.
I'm going to be *VERY* hard to convince that adding a hook inside
checkpointing code is acceptable.
> I'd really hate 9.6 to go out with - still - no way to use logical decoding
> in a basic, bog-standard HA/failover environment. It overwhelmingly limits
> their utility and it's becoming a major drag on practical use of the
> feature. That's a difficulty given that the failover slots patch isn't
> especially trivial and you've shown that lazy sync of slot state is not
> sufficient.
I think the right way to do this is to focus on failover for logical
rep, with separate slots. The whole idea of integrating this with
physical rep imo makes this a *lot* more complex than necessary. Not
all that many people are going to want to mix physical rep and logical
rep.
> The restart_lsn from the newer copy of the slot is, as you said, a point we
> know we can reconstruct visibility info.
We can on the master. There's absolutely no guarantee that the
associated serialized snapshot is present on the standby.
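That gap is easy to probe for. A minimal sketch, assuming the snapshot file layout mirrors PostgreSQL's pg_logical/snapshots/&lt;LSN&gt;.snap naming (treat the exact path and naming as an assumption here): a standby cannot take the copied slot's restart_lsn at face value without first checking that the serialized snapshot actually exists locally.

```python
import os

def snapshot_available(pgdata: str, restart_lsn: str) -> bool:
    """Return True iff a serialized snapshot for restart_lsn is on disk.
    Path layout is an assumption modeled on pg_logical/snapshots/."""
    path = os.path.join(pgdata, "pg_logical", "snapshots",
                        f"{restart_lsn}.snap")
    return os.path.exists(path)
```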
Andres