Re: cleanup patches for incremental backup

From: Nathan Bossart <nathandbossart(at)gmail(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: cleanup patches for incremental backup
Date: 2024-01-27 16:31:09
Message-ID: 20240127163109.GA3047116@nathanxps13
Lists: pgsql-hackers

On Sat, Jan 27, 2024 at 11:00:01AM +0300, Alexander Lakhin wrote:
> 24.01.2024 20:46, Robert Haas wrote:
>> This is weird. There's a little more detail in the log file,
>> regress_log_002_blocks, e.g. from the first failure you linked:
>>
>> [11:18:20.683](96.787s) # before insert, summarized TLI 1 through 0/14E09D0
>> [11:18:21.188](0.505s) # after insert, summarized TLI 1 through 0/14E0D08
>> [11:18:21.326](0.138s) # examining summary for TLI 1 from 0/14E0D08 to 0/155BAF0
>> # 1
>> ...
>> [11:18:21.349](0.000s) # got: 'pg_walsummary: error: could
>> not open file "/home/nm/farm/gcc64/HEAD/pgsql.build/src/bin/pg_walsummary/tmp_check/t_002_blocks_node1_data/pgdata/pg_wal/summaries/0000000100000000014E0D0800000000155BAF0
>> # 1.summary": No such file or directory'
>>
>> The "examining summary" line is generated based on the output of
>> pg_available_wal_summaries(). The way that works is that the server
>> calls readdir(), disassembles the filename into a TLI and two LSNs,
>> and returns the result.
>
> I'm discouraged by "\n1" in the file name and in the
> "examining summary..." message.
> regress_log_002_blocks from the following successful test run on the same
> sungazer node contains:
> [15:21:58.924](0.106s) # examining summary for TLI 1 from 0/155BAE0 to 0/155E750
> [15:21:58.925](0.001s) ok 1 - WAL summary file exists

Ah, I think this query:

SELECT tli, start_lsn, end_lsn from pg_available_wal_summaries()
WHERE tli = $summarized_tli AND end_lsn > '$summarized_lsn'

is returning more than one row in some cases. I attached a quick sketch of
an easy way to reproduce the issue as well as one way to fix it.
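
To illustrate why a second row would produce the "\n1" seen in the log, here is a minimal Python sketch. It is a simulation only, not the test's actual Perl code and not the attached patch. It assumes that the test splits the raw psql output on "|" and builds the summary file name by zero-padding each piece to eight characters. The first row is taken from the failing log above; the second row and all function names are hypothetical.

```python
# Simulated psql unaligned output: "|" between columns, newline between rows.
# Row 1 matches the failing log; row 2 is a hypothetical extra summary.
two_rows = "1|0/14E0D08|0/155BAF0\n1|0/155BAF0|0/1560000"


def parse_naive(output):
    """Split the whole output on "|" and keep the first three fields.

    With more than one row, the third field swallows the newline and
    the leading column of the next row.
    """
    tli, start_lsn, end_lsn = output.split("|")[:3]
    return tli, start_lsn, end_lsn


def parse_first_row(output):
    """Split into rows first, then parse only the first row.

    Also returns the row count so a caller can check that it is 1.
    """
    lines = output.split("\n")
    tli, start_lsn, end_lsn = lines[0].split("|")
    return tli, start_lsn, end_lsn, len(lines)


def summary_name(tli, start_lsn, end_lsn):
    """Build a summary file name (layout assumed from the path in the log)."""
    parts = [tli] + start_lsn.split("/") + end_lsn.split("/")
    return "".join(p.rjust(8, "0") for p in parts) + ".summary"


print(repr(parse_naive(two_rows)[2]))
print(repr(summary_name(*parse_naive(two_rows))))
print(parse_first_row(two_rows))
```

Under those assumptions the naive parse yields an end LSN with an embedded newline, and the resulting file name has the same shape as the one pg_walsummary failed to open. Parsing only the first row (or restricting the query to a single row) would avoid that, though which row is the right one to examine is a separate question for the test.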

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment: repro_and_fix.patch (text/x-diff, 897 bytes)
