Re: Wall shiping replica failed to recover database with error: invalid contrecord length 1956 at FED/38FFE208

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: sfrost(at)snowman(dot)net, zeleny(dot)ales(at)gmail(dot)com, pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Wall shiping replica failed to recover database with error: invalid contrecord length 1956 at FED/38FFE208
Date: 2019-11-14 17:28:13
Message-ID: 30577.1573752493@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com> writes:
> At Wed, 2 Oct 2019 19:24:02 -0400, Stephen Frost <sfrost(at)snowman(dot)net> wrote in
>> * Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
>>> Yeah, those messages are all pretty ancient, from when WAL was new and not
>>> to be trusted much. Perhaps the thing to do is move the existing info
>>> into DETAIL and make the primary message be something like "reached
>>> apparent end of WAL stream".

>> Yes, +1 on that.

> What do you think about this?

It seems overcomplicated. Why do you need to reach into
emode_for_corrupt_record's private state? I think we could just
change the message text without that, and leave the emode
machinations as-is.

I don't understand the restriction to "if (RecPtr == InvalidXLogRecPtr)"
either? Maybe that's fine but the comment fails to explain it.

Another point is that, as the comment for emode_for_corrupt_record
notes, we don't really want to consider that we've hit end-of-WAL
when reading any source except XLOG_FROM_PG_WAL. So I think the
change of message ought to include a test of the source, but this
doesn't.

Also, the business with an "operation" string violates the message
translatability guideline about "don't assemble a message out of
phrases". If we want to have that extra detail, it's better just
to make three separate ereport() calls with separately translatable
messages.

Also, it seems wrong that the page TLI check, just below, is where
it is and isn't part of the main set of page header sanity checks.
That's sort of unrelated to this patch, except not really, because
shouldn't a failure of that test also be treated as an "end of WAL"
condition?

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ron 2019-11-14 18:49:03 Query which shows FK child columns?
Previous Message github kran 2019-11-14 17:23:24 PostGreSQL Replication and question on maintenance