Re: pgsql: Add contrib/pg_walinspect.

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Noah Misch <noah(at)leadboat(dot)com>, Jeff Davis <jdavis(at)postgresql(dot)org>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pgsql: Add contrib/pg_walinspect.
Date: 2022-04-27 01:10:50
Message-ID: CA+hUKG+H_VEBdtK4CVb7uRLaAKbufNOMy-djUsptcqhLxONMmA@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Wed, Apr 27, 2022 at 12:25 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> > I think it's a bug in pg_walinspect, so I'll move the discussion back
> > here. Here's one rather simple way to fix it, that has survived
> > running the test a thousand times (using a recipe that failed for me
> > quite soon, after 20-100 attempts or so; I never figured out how to
> > get the 50% failure rate reported by Tom).
>
> Not sure what we're doing differently, but plain "make check" in
> contrib/pg_walinspect fails pretty consistently for me on gcc23.
> I tried it again just now and got five failures in five attempts.

I tried on the /home filesystem (a slow NFS mount) and then inside a
directory on /tmp to get ext4 (I saw that Noah had somehow got onto a
local filesystem, based on the presence of "ext4" in the pathname, and I
was trying everything I could think of). I used what I thought might
be some relevant starter configure options copied from the animal:

./configure --prefix=$HOME/install --enable-cassert --enable-debug \
    --enable-tap-tests CC="ccache gcc -mips32r2" \
    CFLAGS="-O2 -funwind-tables" LDFLAGS="-rdynamic"

For me, make check always succeeds in contrib/pg_walinspect, but
make installcheck fails if I run it enough times in a loop, somewhere
around the 20th iteration or so, which I imagine has to do with WAL page
boundaries moving around.

for i in `seq 1 1000` ; do
    make -s installcheck || exit 1
done

> I then installed your patch and got the same failure, three times
> out of three, so I don't think we're there yet.

Hrmph... Are you sure you rebuilt the contrib module? Assuming so,
maybe it's failing in a different way for you and me. For me, it
always fails after this break is reached in xlogutil.c:

    /* If asked, let's not wait for future WAL. */
    if (!wait_for_wal)
        break;

If you add a log message there, do you see that? For me, the patch
fixes it, because it teaches pg_walinspect that messageless errors are
a way of detecting end-of-data (due to the code above, introduced by
the pg_walinspect commit).
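
To make the idea concrete, here is a minimal standalone sketch of "a failed
read with no error message means end-of-data". It is not the patch and does
not use the PostgreSQL xlogreader API: MockReader, mock_read_record,
classify_read and count_records are hypothetical names made up for
illustration, and the only assumption carried over from above is that a read
which gives up instead of waiting for future WAL fails without setting a
message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a WAL reader; NOT the PostgreSQL API. */
typedef struct MockReader
{
    int next;        /* index of the next record to hand out */
    int nrecords;    /* number of records currently available */
    int corrupt_at;  /* index that yields a real error, or -1 for none */
} MockReader;

typedef enum { READ_OK, READ_END_OF_DATA, READ_ERROR } ReadStatus;

/*
 * Returns a record index >= 0 on success, or -1 on failure.
 * On a real error *errormsg is set; when the reader merely runs out of
 * data (the "don't wait for future WAL" case) it stays NULL.
 */
static int
mock_read_record(MockReader *r, const char **errormsg)
{
    assert(r != NULL && errormsg != NULL);
    *errormsg = NULL;
    if (r->next == r->corrupt_at)
    {
        *errormsg = "invalid record";
        return -1;
    }
    if (r->next >= r->nrecords)
        return -1;              /* messageless failure */
    return r->next++;
}

/* A failure without a message is treated as end-of-data, not an error. */
static ReadStatus
classify_read(int rec, const char *errormsg)
{
    if (rec >= 0)
        return READ_OK;
    return errormsg != NULL ? READ_ERROR : READ_END_OF_DATA;
}

/* Counts records up to end-of-data; returns -1 if a real error occurs. */
static int
count_records(MockReader *r)
{
    int count = 0;

    for (;;)
    {
        const char *errormsg;
        int rec = mock_read_record(r, &errormsg);
        ReadStatus status = classify_read(rec, errormsg);

        if (status == READ_END_OF_DATA)
            break;
        if (status == READ_ERROR)
            return -1;
        count++;
    }
    return count;
}
```

With a reader holding three records and no corruption, count_records stops
cleanly at the end; with corrupt_at set to a valid index, the same loop
reports a failure instead of silently stopping.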
