Re: contrib/pg_visibility fails regression under CLOBBER_CACHE_ALWAYS

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: contrib/pg_visibility fails regression under CLOBBER_CACHE_ALWAYS
Date: 2021-06-07 12:11:35
Message-ID: CAD21AoBZJuoqFrgHFo3qbFK2v64GbaQZFeYKTKQkbdtzCz7dSA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 7, 2021 at 4:30 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Jun 7, 2021 at 11:15 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > husky just reported $SUBJECT:
> >
> > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=husky&dt=2021-06-05%2013%3A42%3A17
> >
> > and I find I can reproduce that locally:
> >
> > diff -U3 /home/postgres/pgsql/contrib/pg_visibility/expected/pg_visibility.out /home/postgres/pgsql/contrib/pg_visibility/results/pg_visibility.out
> > --- /home/postgres/pgsql/contrib/pg_visibility/expected/pg_visibility.out 2021-01-20 11:12:24.854346717 -0500
> > +++ /home/postgres/pgsql/contrib/pg_visibility/results/pg_visibility.out 2021-06-06 22:12:07.948890104 -0400
> > @@ -215,7 +215,8 @@
> > 0 | f | f
> > 1 | f | f
> > 2 | t | t
> > -(3 rows)
> > + 3 | t | t
> > +(4 rows)
> >
> > select * from pg_check_frozen('copyfreeze');
> > t_ctid
> > @@ -235,7 +236,8 @@
> > 0 | t | t
> > 1 | f | f
> > 2 | t | t
> > -(3 rows)
> > + 3 | t | t
> > +(4 rows)
> >
> > select * from pg_check_frozen('copyfreeze');
> > t_ctid
> >
> >
> > The test cases that are failing date back to January (7db0cd2145f),
> > so I think this is some side-effect of a recent commit, but I have
> > no idea which one.
>
> It seems like the recent revert (8e03eb92e9a) is relevant.
>
> After committing 7db0cd2145f we had the same regression test failure
> in January[1]. Then we fixed that issue by 39b66a91b. But since we
> recently reverted most of 39b66a91b, the same issue happened again.
>

So the cause of this failure seems the same as before. The failed test is,

begin;
truncate copyfreeze;
copy copyfreeze from stdin freeze;
1 '1'
2 '2'
3 '3'
4 '4'
5 '5'
\.
copy copyfreeze from stdin;
6 '6'
\.
copy copyfreeze from stdin freeze;
7 '7'
8 '8'
9 '9'
10 '10'
11 '11'
12 '12'
\.
commit;

If the target block cache is invalidated before the third COPY, we
will start to insert the frozen tuple into a new page, resulting in
adding two blocks in total during the third COPY. I think we still
need the following part of the reverted code so that we don't leave
the page partially empty after relcache invalidation:

--- a/src/backend/access/heap/hio.c
+++ b/src/backend/access/heap/hio.c
@@ -407,19 +407,19 @@ RelationGetBufferForTuple(Relation relation, Size len,
* target.
*/
targetBlock = GetPageWithFreeSpace(relation, targetFreeSpace);
- }

- /*
- * If the FSM knows nothing of the rel, try the last page before we give
- * up and extend. This avoids one-tuple-per-page syndrome during
- * bootstrapping or in a recently-started system.
- */
- if (targetBlock == InvalidBlockNumber)
- {
- BlockNumber nblocks = RelationGetNumberOfBlocks(relation);
+ /*
+ * If the FSM knows nothing of the rel, try the last page before we
+ * give up and extend. This avoids one-tuple-per-page syndrome during
+ * bootstrapping or in a recently-started system.
+ */
+ if (targetBlock == InvalidBlockNumber)
+ {
+ BlockNumber nblocks = RelationGetNumberOfBlocks(relation);

- if (nblocks > 0)
- targetBlock = nblocks - 1;
+ if (nblocks > 0)
+ targetBlock = nblocks - 1;
+ }
}

Attached the patch that brings back the above change.

Regards,

--
Masahiko Sawada
EDB: https://www.enterprisedb.com/

Attachment Content-Type Size
use_last_block.patch application/octet-stream 1.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2021-06-07 12:24:53 Re: speed up verifying UTF-8
Previous Message David Rowley 2021-06-07 11:50:04 Re: Tid scan improvements