Re: bad wal on replica / incorrect resource manager data checksum in record / zfs

From: Alex Malek <magicagent(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: bad wal on replica / incorrect resource manager data checksum in record / zfs
Date: 2020-04-06 14:59:47
Message-ID: CAGH8ccd8Fd4j6HFw9m+Qr0w6zLPNzg-d2sDL1gFc-+FDAh+vOA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 2, 2020 at 2:10 PM Andres Freund <andres(at)anarazel(dot)de> wrote:

> Hi,
>
> On 2020-02-19 16:35:53 -0500, Alex Malek wrote:
> > We are having a recurring issue on 2 of our replicas where replication
> > stops due to this message:
> > "incorrect resource manager data checksum in record at ..."
>
> Could you show the *exact* log output please? Because this could
> temporarily occur without signalling anything bad, if e.g. the
> replication connection goes down.
>

Feb 23 00:02:02 wrds-pgdata10-2-w postgres[68329]: [12491-1] 5e4aac44.10ae9
(@) LOG: incorrect resource manager data checksum in record at
39002/57AC0338

When it occurred, replication stopped. The only way to resume replication
was to stop the server and remove the bad WAL file.
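
In case it helps anyone else hitting this, the removal step looks roughly
like the following. This is a sketch only: the paths are hypothetical, and
the segment name for 39002/57AC0338 assumes timeline 1 and the default 16MB
WAL segment size.

    # on the replica (data directory path is an assumption)
    pg_ctl -D /var/lib/postgresql/data stop -m fast
    # move the suspect segment out of pg_wal, keeping it for comparison
    mv /var/lib/postgresql/data/pg_wal/000000010003900200000057 /tmp/bad-wal/
    # on restart the replica re-fetches the segment from the master
    # (via streaming or restore_command)
    pg_ctl -D /var/lib/postgresql/data start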

>
>
> > Right before the issue started we did some upgrades and altered some
> > postgres configs and ZFS settings.
> > We have been slowly rolling back changes but so far the issue continues.
> >
> > Some interesting data points while debugging:
> > We had lowered the ZFS recordsize from 128K to 32K and for that week the
> > issue started happening every other day.
> > Using xxd and diff we compared "good" and "bad" wal files and the
> > differences were not random bad bytes.
> >
> > The bad file either had a block of zeros that were not in the good file at
> > that position or other data. Occasionally the bad data has contained
> > legible strings not in the good file at that position. At least one of
> > those exact strings has existed elsewhere in the files.
> > However I am not sure if that is the case for all of them.
> >
> > This made me think that maybe there was an issue w/ wal file recycling and
> > ZFS under heavy load, so we tried lowering min_wal_size in order to
> > "discourage" wal file recycling. My understanding is that a low value
> > discourages recycling but it will still happen (unless wal_recycle is
> > turned off, which is possible in PostgreSQL 12).
>
> This sounds a lot more like a broken filesystem than anything on the PG
> level.
>

Probably. As noted in my recent update, turning off ZFS compression on the
master seems to have fixed the issue.
However, I will note that the WAL file stored on the master was always fine
upon inspection.
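
For anyone trying to reproduce this, the compression change and the kind of
comparison I mean look roughly like the following; the dataset name and
paths are assumptions, not our exact values:

    # disable ZFS compression on the dataset holding the master's data
    # (dataset name is hypothetical)
    zfs set compression=off tank/pgdata

    # hex-dump the master's copy and the replica's bad copy and diff them
    xxd /path/to/master-copy/000000010003900200000057 > good.hex
    xxd /path/to/replica-copy/000000010003900200000057 > bad.hex
    diff good.hex bad.hex | less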

>
>
> > When using replication slots, what circumstances would cause the master to
> > not save the WAL file?
>
> What do you mean by "save the WAL file"?
>

Typically, when using replication slots, if replication stops the master will
retain (not recycle) the next WAL file the replica needs.
However, once or twice when this error occurred the master had recycled or
removed the WAL file that was still needed.
I suspect this happened because the replica had started to read the WAL file
and signaled to the master that the file was already consumed. That is a
guess; I don't know exactly what is happening, and this situation was rare
and not the norm. It is also possible it was caused by a different error.
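
One way to see what the master thinks the slot still needs, as a sketch only
(run on the master; the query is generic and the interpretation in the
comments is my assumption):

    # restart_lsn is the oldest WAL position the slot still pins;
    # pg_walfile_name() maps it to a segment file name
    psql -c "SELECT slot_name, active, restart_lsn,
                    pg_walfile_name(restart_lsn) AS restart_walfile
             FROM pg_replication_slots;"
    # if restart_walfile is already past the segment the replica is asking
    # for, the master was free to recycle the file the replica still wanted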

Thanks.
Alex
