Re: incremental backup breakage in BlockRefTableEntryGetBlocks

From: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: incremental backup breakage in BlockRefTableEntryGetBlocks
Date: 2024-04-05 06:59:02
Message-ID: CAKZiRmzZWayaa3Uaxds7-tWqpbtrGDFqrxOTa8Zd-SsxrfrgFA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 4, 2024 at 9:11 PM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 4/4/24 19:38, Robert Haas wrote:
> > Hi,
> >
> > Yesterday, Tomas Vondra reported to me off-list that he was seeing
> > what appeared to be data corruption after taking and restoring an
> > incremental backup. Overnight, Jakub Wartak further experimented with
> > Tomas's test case, did some initial analysis, and made it very easy to
> > reproduce. I spent this morning tracking down the problem, for which I
> > attach a patch.
> >
>
> Thanks, I can confirm this fixes the issue I've observed/reported. On
> master 10 out of 10 runs failed, with the patch no failures.

Same here, patch fixes it on recent master. I've also run pgbench for
~30mins and compared master and incremental and got 0 differences,
should be good.

> The test is very simple:
>
> 1) init pgbench

Tomas had magic fingers here - he used pgbench -i -s 100 which causes
bigger relations (it wouldn't trigger for smaller -s values as Robert
explained - now it makes full sense; in earlier tests I was using much
smaller -s , then transitioned to other workloads (mostly append
only), and final 100GB+/24h+ tests used mostly INSERTs rather than
UPDATEs AFAIR). The other interesting thing is that one of the animals
runs with configure --with-relsegsize=<somesmallvalue> (so new
relations are full much earlier) and it was not catched there either -
Wouldn't it be good idea to to test in src/test/recover/ like that?

And of course i'm attaching reproducer with some braindump notes in
case in future one hits similiar issue and wonders where to even start
looking (it's very primitive though but might help).

-J.

Attachment Content-Type Size
reproduce_incremental_bug_04_04_2024_v1.txt text/plain 8.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2024-04-05 07:09:29 Re: remaining sql/json patches
Previous Message Anton A. Melnikov 2024-04-05 06:30:02 Re: effective_multixact_freeze_max_age issue