| From: | David Steele <david(at)pgbackrest(dot)org> |
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> |
| Cc: | Haibo Yan <tristan(dot)yim(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
| Subject: | Re: Return pg_control from pg_backup_stop(). |
| Date: | 2026-04-13 14:55:10 |
| Message-ID: | 4193dcbc-591e-44bb-816c-43b4ae70d31c@pgbackrest.org |
| Lists: | pgsql-hackers |
On 3/18/26 19:26, David Steele wrote:
> On 3/18/26 15:26, Michael Paquier wrote:
>> On Wed, Mar 18, 2026 at 07:35:47AM +0000, David Steele wrote:
>>> You are correct -- the copy of pg_control needs to happen before
>>> do_pg_backup_stop(). An older version of this patch saved pg_control in
>>> backup_state which made the prior location safe. However, I missed moving
>>> this code when I moved pg_control out of backup_state. Code review to the
>>> rescue.
>>
>> Right. I am wondering also if the final result would not be better
>> without 0002, actually, focusing only on the "simpler" base backup
>> case through the replication protocol, and you are making a good case
>> in mentioning it as not absolutely mandatory for base backups that are
>> taken through the SQL functions. One could always tweak the flag
>> manually in the control file based on the contents taken from the data
>> folder. That's more hairy than writing the entire file, for sure,
>> still possible.
>
> Getting even 01 into PG19 would be a great outcome. This would solve the
> problem of torn pg_control and deleted backup labels for any backups
> made with pg_basebackup and that's going to cover a *lot* of cases.
>
> Established third-party backup solutions that are not based on
> pg_basebackup are generally able to manipulate pg_control so that's not
> as much of a concern, perhaps. It does raise the barrier of entry for
> new backup software if they need to learn to read and validate
> pg_control to avoid a torn copy and set the flag. Patch 02 solves that
> problem in a general way so I still think it adds value for the
> ecosystem -- but we could always discuss that in the PG20 cycle.
>
> Whatever gets committed for PG19 I'll write a followup patch to describe
> the hazards of reading pg_control and generally how to get a good copy.
> However, this will be complicated enough that the best answer will
> likely be to use pg_basebackup or some other reputable backup software.
> I don't love this -- I feel like the low-level interface should be
> usable with such hazards.

I have withdrawn this patch. If anybody wants to pick it up in the
future I'll be happy to rebase it but I think two years is long enough
to maintain a patch that is not getting traction.

We are left with the issue that pg_basebackup backups may contain a torn
copy of pg_control. At the least this should be documented.

It would also be a good idea to document that utilizing the low-level
backup interface requires validating the checksum in pg_control to avoid
a torn copy. This is non-trivial but certainly doable.

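
For illustration, here is a minimal sketch of what such validation might look like from the backup tool's side. It assumes the checksum is a CRC-32C computed over the bytes of pg_control that precede the stored crc field, and that the CRC is stored as a 4-byte unsigned integer in the server's native byte order. The offset of that field (crc_offset below) depends on the PostgreSQL version and platform, so it is left as a parameter rather than hard-coded. The function names and the retry policy are hypothetical, not part of any PostgreSQL API.

```python
import struct
import time


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def control_file_is_intact(raw: bytes, crc_offset: int, byte_order: str = "<") -> bool:
    """Return True if the CRC stored at crc_offset matches the preceding bytes.

    crc_offset is an assumption supplied by the caller: it must match the
    layout of the control file for the server version being backed up.
    """
    if len(raw) < crc_offset + 4:
        return False
    (stored,) = struct.unpack_from(byte_order + "I", raw, crc_offset)
    return crc32c(raw[:crc_offset]) == stored


def read_stable_control_file(path: str, crc_offset: int,
                             attempts: int = 5, delay: float = 0.1) -> bytes:
    """Re-read the file until a copy passes the CRC check (hypothetical policy)."""
    for _ in range(attempts):
        with open(path, "rb") as f:
            raw = f.read()
        if control_file_is_intact(raw, crc_offset):
            return raw
        time.sleep(delay)  # a concurrent write may have torn this read; retry
    raise RuntimeError("could not read a consistent copy of pg_control")
```

A passing CRC only shows that the copy is not torn. It does not cover the other steps discussed upthread, such as setting the flag in the control file.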
Regards,
-David