Re: snapbuild woes

From: Andres Freund <andres(at)anarazel(dot)de>
To: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Erik Rijkers <er(at)xs4all(dot)nl>, Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Subject: Re: snapbuild woes
Date: 2017-05-11 22:12:17
Message-ID: 20170511221217.dqok4ixviv6hlp3n@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017-04-15 05:18:49 +0200, Petr Jelinek wrote:
> >>> From 3318a929e691870f3c1ca665bec3bfa8ea2af2a8 Mon Sep 17 00:00:00 2001
> >>> From: Petr Jelinek <pjmodos(at)pjmodos(dot)net>
> >>> Date: Sun, 26 Feb 2017 01:07:33 +0100
> >>> Subject: [PATCH 3/5] Prevent snapshot builder xmin from going backwards
> >>
> >> A bit more commentary would be good. What does that protect us against?
> >>
> >
> > I think I explained that in the email. We might export snapshot with
> > xmin smaller than global xmin otherwise.
> >
>
> Updated commit message with explanation as well.

> From ae60b52ae0ca96bc14169cd507f101fbb5dfdf52 Mon Sep 17 00:00:00 2001
> From: Petr Jelinek <pjmodos(at)pjmodos(dot)net>
> Date: Sun, 26 Feb 2017 01:07:33 +0100
> Subject: [PATCH 3/5] Prevent snapshot builder xmin from going backwards
>
> Logical decoding snapshot builder may encounter xl_running_xacts with
> older xmin than the xmin of the builder. This can happen because
> LogStandbySnapshot() sometimes sees already committed transactions as
> running (there is difference between "running" in terms for WAL and in
> terms of ProcArray). When this happens we must make sure that the xmin
> of snapshot builder won't go back otherwise the resulting snapshot would
> show some transaction as running even though they have already
> committed.
> ---
> src/backend/replication/logical/snapbuild.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
> index ada618d..3e34f75 100644
> --- a/src/backend/replication/logical/snapbuild.c
> +++ b/src/backend/replication/logical/snapbuild.c
> @@ -1165,7 +1165,8 @@ SnapBuildProcessRunningXacts(SnapBuild *builder, XLogRecPtr lsn, xl_running_xact
> * looking, it's correct and actually more efficient this way since we hit
> * fast paths in tqual.c.
> */
> - builder->xmin = running->oldestRunningXid;
> + if (TransactionIdFollowsOrEquals(running->oldestRunningXid, builder->xmin))
> + builder->xmin = running->oldestRunningXid;
>
> /* Remove transactions we don't need to keep track off anymore */
> SnapBuildPurgeCommittedTxn(builder);
> --
> 2.7.4

I still don't understand. The snapshot's xmin is solely managed via
xl_running_xacts, so I don't see how the WAL/procarray difference can
play a role here. ->committed isn't pruned before xl_running_xacts
indicates it can be done, so I don't understand your explanation above.

I'd be ok with adding this as paranoia check, but I still don't
understand when it could practically be hit.

- Andres

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2017-05-11 22:13:30 Re: WITH clause in CREATE STATISTICS
Previous Message Andres Freund 2017-05-11 21:54:26 Re: snapbuild woes