Re: pg15b3: recovery fails with wal prefetch enabled

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, "Shinoda, Noriyoshi (PN Japan FSIP)" <noriyoshi(dot)shinoda(at)hpe(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg15b3: recovery fails with wal prefetch enabled
Date: 2022-09-05 09:08:16
Message-ID: CA+hUKG+vsm00hYtwSyQX9T2cw3qX_d6ZDgAtT9Z+4reO4ePt8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 5, 2022 at 5:34 PM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
> At Mon, 05 Sep 2022 14:15:27 +0900 (JST), Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> wrote in
> me> +1 for showing any message for the failure, but I think we shouldn't
> me> hide an existing message if any.
>
> At Mon, 5 Sep 2022 16:54:07 +1200, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote in
> > On reflection, it'd be better not to clobber any pre-existing error
> > there, but report one only if there isn't one already queued. I've
> > done that in this version, which I'm planning to do a bit more testing
> > on and commit soonish if there are no comments/objections, especially
> > for that part.
>
> It looks fine in this regard. I still think that the message looks
> somewhat internal.

Thanks for looking!

> me> And the error messages around are
> me> just telling that "<some error happened> at RecPtr". So I think
> me> "missing contrecord at RecPtr" is sufficient here.

Ok, I've updated it like that.

> > I'll have to check whether a doc change is necessary somewhere to
> > advertise that maintenance_io_concurrency=0 turns off prefetching, but
> > IIRC that's kinda already implied.
> >
> > I've tested quite a lot of scenarios including make check-world with
> > maintenance_io_concurrency = 0, 1, 10, 1000, and ALTER SYSTEM for all
> > relevant GUCs on a standby running large pgbench to check expected
> > effect on pg_stat_recovery_prefetch view and generate system calls.
>
> + if (likely(record = prefetcher->reader->record))
>
> Isn't this confusing a bit?
>
>
> + if (likely(record = prefetcher->reader->record))
> + {
> + XLogRecPtr replayed_up_to = record->next_lsn;
> +
> + XLogReleasePreviousRecord(prefetcher->reader);
> +
>
> The likely condition is the prerequisite for
> XLogReleasePreviousRecord. But is is a little hard to read the
> condition as "in case no previous record exists". Since there is one
> in most cases, can't call XLogReleasePreviousRecord() unconditionally
> then the function returns true when the previous record exists and
> false if not?

We also need the LSN that is past that record.
XLogReleasePreviousRecord() could return it (or we could use
reader->EndRecPtr I suppose). Thoughts on this version?

Attachment Content-Type Size
v3-0001-Fix-recovery_prefetch-with-low-maintenance_io_con.patch text/x-patch 7.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2022-09-05 09:23:47 Re: Missing CFI in iterate_word_similarity()
Previous Message Kyotaro Horiguchi 2022-09-05 08:32:20 Re: Patch to address creation of PgStat* contexts with null parent context