Re: Wrong usage of RelationNeedsWAL

From: Noah Misch <noah(at)leadboat(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, andres(at)anarazel(dot)de
Subject: Re: Wrong usage of RelationNeedsWAL
Date: 2021-01-19 09:31:52
Message-ID: 20210119093152.GA1831573@rfd.leadboat.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Jan 19, 2021 at 01:48:31PM +0900, Kyotaro Horiguchi wrote:
> At Mon, 18 Jan 2021 17:30:22 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> > At Sun, 17 Jan 2021 23:02:18 -0800, Noah Misch <noah(at)leadboat(dot)com> wrote in
> > > On Sun, Jan 17, 2021 at 10:36:31PM -0800, Noah Misch wrote:
> > > > I wrote the above based on the "PageGetLSN(page) > (snapshot)->lsn" check in
> > > > TestForOldSnapshot(). If the LSN isn't important, what else explains
> > > > RelationAllowsEarlyPruning() checking RelationNeedsWAL()?
> > >
> > > Thinking about it more, some RelationAllowsEarlyPruning() callers need
> > > different treatment. Above, I was writing about the case of deciding whether
> > > to do early pruning. The other RelationAllowsEarlyPruning() call sites are
> > > deciding whether the relation might be lacking old data. For that case, we
> > > should check RELPERSISTENCE_PERMANENT, not RelationNeedsWAL(). Otherwise, we
> > > could fail to report an old-snapshot error in a case like this:
> > >
> A> > setup: CREATE TABLE t AS SELECT ...;
> B> > xact1: BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT 1; -- take snapshot
> C> > xact2: DELETE FROM t;
> D> > (plenty of time passes)
> E> > xact3: SELECT count(*) FROM t; -- early pruning
> F> > xact1: SAVEPOINT q; SELECT count(*) FROM t; ROLLBACK TO q; -- "snapshot too old"
> G> > xact1: ALTER TABLE t SET TABLESPACE something; -- start skipping WAL
> H> > xact1: SELECT count(*) FROM t; -- no error, wanted "snapshot too old"
> > >
> > > Is that plausible?
> >
> > Thank you for the consideration and yes. But I get "snapshot too old"
> > from the last query with the patched version. maybe I'm missing
> > something. I'm going to investigate the case.
>
> Ah. I took that in reverse way. (caught by the discussion on page
> LSN.) Ok, the "current" code works that way. So we need to perform the
> check the new way in RelationAllowsEarlyPruning. So in a short answer
> for the last question in regard to the reference side is yes.

Right. I expect the above procedure shows a bug in git master, but your patch
versions suffice to fix that bug.

> I understand that you are suggesting that at least
> TransactionIdLimitedForOldSnapshots should follow not only relation
> persistence but RelationNeedsWAL, based on the discussion on the
> check on LSN of TestForOldSnapshot().
>
> I still don't think that LSN in the WAL-skipping-created relfilenode
> harm. ALTER TABLE SET TABLESPACE always makes a dead copy of every
> block (except checksum) including page LSN regardless of wal_level. In
> the scenario above, the last select at H correctly reads page LSN set
> by E then copied by G, which is larger than the snapshot LSN at B. So
> doesn't go the fast-return path before actual check performed by
> RelationAllowsEarlyPruning.

I agree the above procedure doesn't have a problem under your patch versions.
See https://postgr.es/m/20210116043816.GA1644261@rfd.leadboat.com for a
different test case. In more detail:

setup: CREATE TABLE t AS SELECT ...;
xact1: BEGIN; DELETE FROM t;
xact2: BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT 1; -- take snapshot
xact1: COMMIT;
(plenty of time passes, rows of t are now eligible for early pruning)
xact3: BEGIN;
xact3: ALTER TABLE t SET TABLESPACE something; -- start skipping WAL
xact3: SELECT count(*) FROM t; -- early pruning w/o WAL, w/o LSN changes
xact3: COMMIT;
xact2: SELECT count(*) FROM t; -- no error, wanted "snapshot too old"

I expect that shows no bug for git master, but I expect it does show a bug
with your patch versions. Could you try implementing both test procedures in
src/test/modules/snapshot_too_old? There's no need to make the test use
wal_level=minimal explicitly; it's sufficient to catch these bugs when
wal_level=minimal is in the TEMP_CONFIG file.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Nancarrow 2021-01-19 09:36:52 Re: Parallel INSERT (INTO ... SELECT ...)
Previous Message Heikki Linnakangas 2021-01-19 09:09:16 Re: ResourceOwner refactoring