Re: Return pg_control from pg_backup_stop().

From: David Steele <david(at)pgbackrest(dot)org>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Haibo Yan <tristan(dot)yim(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>
Subject: Re: Return pg_control from pg_backup_stop().
Date: 2026-04-13 14:55:10
Message-ID: 4193dcbc-591e-44bb-816c-43b4ae70d31c@pgbackrest.org
Lists: pgsql-hackers

On 3/18/26 19:26, David Steele wrote:
> On 3/18/26 15:26, Michael Paquier wrote:
>> On Wed, Mar 18, 2026 at 07:35:47AM +0000, David Steele wrote:
>>> You are correct -- the copy of pg_control needs to happen before
>>> do_pg_backup_stop(). An older version of this patch saved pg_control in
>>> backup_state which made the prior location safe. However, I missed moving
>>> this code when I moved pg_control out of backup_state. Code review to the
>>> rescue.
>>
>> Right.  I am wondering also if the final result would not be better
>> without 0002, actually, focusing only on the "simpler" base backup
>> case through the replication protocol, and you are making a good case
>> in mentioning it as not absolutely mandatory for base backups that are
>> taken through the SQL functions.  One could always tweak the flag
>> manually in the control file based on the contents taken from the data
>> folder.  That's more hairy than writing the entire file, for sure,
>> still possible.
>
> Getting even 01 into PG19 would be a great outcome. This would solve the
> problem of torn pg_control and deleted backup labels for any backups
> made with pg_basebackup and that's going to cover a *lot* of cases.
>
> Established third-party backup solutions that are not based on
> pg_basebackup are generally able to manipulate pg_control so that's not
> as much of a concern, perhaps. It does raise the barrier of entry for
> new backup software if they need to learn to read and validate
> pg_control to avoid a torn copy and set the flag. Patch 02 solves that
> problem in a general way so I still think it adds value for the
> ecosystem -- but we could always discuss that in the PG20 cycle.
>
> Whatever gets committed for PG19, I'll write a follow-up patch to describe
> the hazards of reading pg_control and generally how to get a good copy.
> However, this will be complicated enough that the best answer will
> likely be to use pg_basebackup or some other reputable backup software.
> I don't love this -- I feel like the low-level interface should be
> usable without such hazards.

I have withdrawn this patch. If anybody wants to pick it up in the
future I'll be happy to rebase it but I think two years is long enough
to maintain a patch that is not getting traction.

We are left with the issue that pg_basebackup backups may contain a torn
copy of pg_control. At the least this should be documented.

It would also be a good idea to document that utilizing the low-level
backup interface requires validating the checksum in pg_control to avoid
a torn copy. This is non-trivial but certainly doable.
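
As a rough illustration of what that validation involves, here is a minimal
sketch in Python. It assumes the checksum is a standard CRC-32C computed over
the bytes of ControlFileData that precede its crc field, and that the file
was written in little-endian byte order. The names crc32c,
pg_control_is_valid, and read_pg_control are hypothetical helpers, not part
of PostgreSQL. The crc_offset argument must be supplied by the caller as the
offset of the crc field for the server version in question (taken from that
version's pg_control.h); it is deliberately not hard-coded here.

```python
import struct
import time


def _make_crc32c_table():
    # Reflected form of the Castagnoli polynomial used by CRC-32C.
    poly = 0x82F63B78
    table = []
    for n in range(256):
        crc = n
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC32C_TABLE = _make_crc32c_table()


def crc32c(data: bytes) -> int:
    """Table-driven CRC-32C (init 0xFFFFFFFF, final XOR 0xFFFFFFFF)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _CRC32C_TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def pg_control_is_valid(buf: bytes, crc_offset: int, byte_order: str = "<") -> bool:
    """Return True if the CRC stored at crc_offset matches the preceding bytes.

    crc_offset is an assumption supplied by the caller: the offset of the
    crc field in ControlFileData for the server version being backed up.
    """
    if len(buf) < crc_offset + 4:
        return False  # Short read: treat as torn.
    (stored,) = struct.unpack_from(byte_order + "I", buf, crc_offset)
    return crc32c(buf[:crc_offset]) == stored


def read_pg_control(path: str, crc_offset: int, attempts: int = 5,
                    delay: float = 0.1) -> bytes:
    """Re-read the file until the checksum validates or attempts run out."""
    for _ in range(attempts):
        with open(path, "rb") as f:
            buf = f.read()
        if pg_control_is_valid(buf, crc_offset):
            return buf
        time.sleep(delay)  # The file may have been mid-write; try again.
    raise RuntimeError("could not read a consistent copy of pg_control")
```

A backup tool would call read_pg_control() on global/pg_control and store the
returned bytes rather than copying the file directly. The retry loop only
guards against a torn read; it does not set any flag in the control file.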

Regards,
-David
