Re: prevent immature WAL streaming

From: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: masao(dot)fujii(at)oss(dot)nttdata(dot)com, andres(at)anarazel(dot)de, pgsql-hackers(at)lists(dot)postgresql(dot)org, bossartn(at)amazon(dot)com, mengjuan(dot)cmj(at)alibaba-inc(dot)com, Jakub(dot)Wartak(at)tomtom(dot)com
Subject: Re: prevent immature WAL streaming
Date: 2021-09-03 13:59:29
Message-ID: 202109031359.by2swmsx6ha2@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2021-Sep-03, Kyotaro Horiguchi wrote:

> At Thu, 2 Sep 2021 18:43:33 -0400, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote in

> 0002:
> > + XLogRecPtr abortedContrecordPtr; /* LSN of incomplete record at end of
> > + * WAL */
>
> The name sounds like the start LSN. doesn't contrecordAbort(ed)Ptr work?

I went over various iterations of the name of this, and still not
entirely happy. I think we need to convey the ideas that

* This is the endptr+1 of the known-good part of the record, that is,
the beginning of the next part of the record. I think "endPtr"
summarizes this well; we use this name elsewhere.

* At some point before recovery, this was the last WAL record that
existed

* there is an invalid contrecord, or we were looking for a contrecord
and found invalid data

* this record is incomplete

So maybe
1. incompleteRecEndPtr
2. finalInvalidRecEndPtr
3. brokenContrecordEndPtr

> > if (!(pageHeader->xlp_info & XLP_FIRST_IS_CONTRECORD))
> > {
> > report_invalid_record(state,
> > "there is no contrecord flag at %X/%X",
> > LSN_FORMAT_ARGS(RecPtr));
> > - goto err;
> > + goto aborted_contrecord;
>
> This loses the exclusion check between XLP_FIRST_IS_CONTRECORD and
> _IS_ABROTED_PARTIAL. Is it okay? (I don't object to remove the check.).

Yeah, I was unsure about that. I think it's good to have it as a
cross-check, though it should never occur. I'll put it back.

Another related point is whether it's a good idea to have the ereport()
about the bit appearing in a not-start-of-page address being a PANIC.
If we degrade to WARNING then it'll be lost in the noise, but I'm not
sure what else can we do. (If it's a PANIC, then you end up with an
unusable database).

> I didin't thought this as an aborted contrecord. but on second
> thought, when we see a record broken in any style, we stop recovery at
> the point. I agree to the change and all the silmiar changes.
>
> + /* XXX should we goto aborted_contrecord here? */
>
> I think it should be aborted_contrecord.
>
> When that happens, the loaded bytes actually looked like the first
> fragment of a continuation record to xlogreader, even if the cause
> were a broken total_len. So if we abort the record there, the next
> time xlogreader will meet XLP_FIRST_IS_ABORTED_PARTIAL at the same
> page, and correctly finds a new record there.
>
> On the other hand if we just errored-out there, we will step-back to
> the beginning of the broken record in the previous page or segment
> then start writing a new record there but that is exactly what we want
> to avoid now.

Hmm, yeah, makes sense.

--
Álvaro Herrera 39°49'30"S 73°17'W — https://www.EnterpriseDB.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2021-09-03 14:12:49 Re: replay of CREATE TABLESPACE eats data at wal_level=minimal
Previous Message Amit Langote 2021-09-03 13:46:23 Re: Multi-Column List Partitioning