Re: Make mesage at end-of-recovery less scary.

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: aleksander(at)timescale(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, vignesh21(at)gmail(dot)com, stark(dot)cfm(at)gmail(dot)com, alvherre(at)alvh(dot)no-ip(dot)org, pryzby(at)telsasoft(dot)com, andres(at)anarazel(dot)de, ashu(dot)coek88(at)gmail(dot)com, pashkin(dot)elfe(at)gmail(dot)com, michael(at)paquier(dot)xyz, bossartn(at)amazon(dot)com, david(at)pgmasters(dot)net, peter(dot)eisentraut(at)2ndquadrant(dot)com, jtc331(at)gmail(dot)com, robertmhaas(at)gmail(dot)com
Subject: Re: Make mesage at end-of-recovery less scary.
Date: 2024-01-16 02:57:03
Message-ID: 20240116.115703.2298339297458587094.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thank you for the comments.

At Fri, 12 Jan 2024 15:03:26 +0300, Aleksander Alekseev <aleksander(at)timescale(dot)com> wrote in
> ```
> + p = (char *) record;
> + pe = p + XLOG_BLCKSZ - (RecPtr & (XLOG_BLCKSZ - 1));
> +
> + while (p < pe && *p == 0)
> + p++;
> +
> + if (p == pe)
> ```
>
> Just as a random thought: perhaps we should make this a separate
> function, as a part of src/port/. It seems to me that this code could
> benefit from using vector instructions some day, similarly to
> memcmp(), memset() etc. Surprisingly there doesn't seem to be a
> standard C function for this. Alternatively one could argue that one
> cycle doesn't make much code to reuse and that the C compiler will
> place SIMD instructions for us. However a counter-counter argument
> would be that we could use a macro or even better an inline function
> and have the same effect except getting a slightly more readable code.

Creating a function with a name like memcmp_byte() should be
straightforward, but implementing it with SIMD right away seems a bit
challenging. Similar operations are already being performed elsewhere
in the code, probably within the stats collector, where memcmp is used
with a statically allocated area that's filled with zeros. If we can
achieve a performance equivalent to memcmp with this new function,
then it definitely seems worth pursuing.

> ```
> - * This is just a convenience subroutine to avoid duplicated code in
> + * This is just a convenience subroutine to avoid duplicate code in
> ```
>
> This change doesn't seem to be related to the patch. Personally I
> don't mind it though.

Ah, I'm sorry. That was something I mistakenly thought I had written
at the last moment and made modifications to.

> All in all I find v28 somewhat scary. It does much more than "making
> one message less scary" as it was initially intended and what bugged
> me personally, and accordingly touches many more places including
> xlogreader.c, xlogrecovery.c, etc.
>
> Particularly I have mixed feeling about this:
>
> ```
> + /*
> + * Consider it as end-of-WAL if all subsequent bytes of this page
> + * are zero. We don't bother checking the subsequent pages since
> + * they are not zeroed in the case of recycled segments.
> + */
> ```
>
> If I understand correctly, if somehow several FS blocks end up being
> zeroed (due to OS bug, bit rot, restoring from a corrupted for
> whatever reason backup, hardware failures, ...) there is non-zero
> chance that PG will interpret this as a normal situation. To my
> knowledge this is not what we typically do - typically PG would report
> an error and ask a human to figure out what happened. Of course the
> possibility of such a scenario is small, but I don't think that as
> DBMS developers we can ignore it.

For now, let me explain the basis for this patch. The fundamental
issue is that these warnings that always appear are, in practice, not
a problem in almost all cases. Some of those who encounter them for
the first time may feel uneasy and reach out with inquiries. On the
other hand, those familiar with these warnings tend to ignore them and
only pay attention to details when actual issues arise. Therefore, the
intention of this patch is to label them as "no issue" unless a
problem is blatantly evident, in order to prevent unnecessary concern.

> Does anyone agree or maybe I'm making things up?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2024-01-16 03:08:36 Re: Revise the Asserts added to bimapset manipulation functions
Previous Message Sutou Kouhei 2024-01-16 02:53:00 Re: Make COPY format extendable: Extract COPY TO format implementations