Snapbuild woes followup

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Andres Freund <andres(at)anarazel(dot)de>, Petr Jelinek <petr(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Snapbuild woes followup
Date: 2021-01-25 17:28:33
Message-ID: c94be044-818f-15e3-1ad3-7a7ae2dfed0a@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

In SnapBuildFinalSnapshot(), we have this comment:
> /*
> * c) transition from START to BUILDING_SNAPSHOT.
> *
> * In START state, and a xl_running_xacts record with running xacts is
> * encountered. In that case, switch to BUILDING_SNAPSHOT state, and
> * record xl_running_xacts->nextXid. Once all running xacts have finished
> * (i.e. they're all >= nextXid), we have a complete catalog snapshot. It
> * might look that we could use xl_running_xact's ->xids information to
> * get there quicker, but that is problematic because transactions marked
> * as running, might already have inserted their commit record - it's
> * infeasible to change that with locking.
> */

This was added in commit 955a684e040, before that we did in fact use the
xl_running_xacts list of XIDs, but it was buggy. Commit 955a684e040
fixed that by waiting for *two* xl_running_xacts, such that the second
xl_running_xact doesn't contain any of the XIDs from the first one.

To fix the case mentioned in that comment, where a transaction listed in
xl_running_xacts is in fact already committed or aborted, wouldn't it be
more straightforward to check each XID, if they are in fact already
committed or aborted? The CLOG is up-to-date here, I believe.

I bumped into this, after I noticed that read_local_xlog_page() has a
pretty busy polling loop, with just 1 ms delay to keep it from hogging
CPU. I tried to find the call sites where we might get into that loop,
and found that the snapshot building code to do that: the
'delayed_startup' regression test in contrib/test_decoding exercises it.
In a primary server, SnapBuildWaitSnapshot() inserts a new running-xacts
record, and then read_local_xlog_page() will poll until that record has
been flushed. We could add an explicit XLogFlush() there to avoid the
wait. However, if I'm reading the code correctly, in a standby server,
we can't write a new running-xacts record so we just wait for one that's
created periodically by bgwriter in the primary. That can take several
seconds. Or indefinitely, if the standby isn't connected to the primary
at the moment. Would be nice to not poll.

- Heikki

P.S. There's this in SnapBuildNextPhaseAt():

> /*
> * For backward compatibility reasons this has to be stored in the wrongly
> * named field. Will be fixed in next major version.
> */
> return builder->was_running.was_xmax;

We could fix that now... Andres, what did you have in mind for a proper
name?

P.P.S. Two thinkos in comments in snapbuild.c: s/write for/wait for/.

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2021-01-25 17:38:27 Re: CheckpointLock needed in CreateCheckPoint()?
Previous Message Jan Wieck 2021-01-25 17:17:58 Re: Extensibility of the PostgreSQL wire protocol