Re: Add system identifier to backup manifest

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amul Sul <sulamul(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Add system identifier to backup manifest
Date: 2024-02-01 18:32:57
Message-ID: CA+Tgmoa8okr1YCFcCOGgvOTASiXsJgx-oqrCS41Wir=oC8c-mA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Feb 1, 2024 at 2:18 AM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> I intended to minimize the out param of parse_manifest_file(), which currently
> returns manifest_files_hash and manifest_wal_range, and I need two more --
> manifest versions and the system identifier.

Sure, but you could do context.ht = manifest_data->files instead of
context.manifest = manifest_data. The question isn't whether you
should return the whole manifest_data from parse_manifest_file -- I
agree with that decision -- but rather whether you should feed the
whole thing through into the context, or just the file hash.

> Yeah, we can do that, but I think it is a bit inefficient to have strcmp()
> check for the pg_control file on each verify_backup_file() call, despite, we
> know that path. Also, I think, we need additional handling to ensure that the
> system identifier has been verified in verify_backup_file(), what if the
> pg_control file itself missing from the backup -- might be a rare case, but
> possible.
>
> For now, we can do the system identifier validation after
> verify_backup_directory().

Yes, that's another option, but I don't think it's as good.

Suppose you do it that way. Then what will happen when the file is
altogether missing or inaccessible? I think verify_backup_file() will
complain, and then you'll have to do something ugly to avoid having
verify_system_identifier() emit the same complaint all over again.
Remember, unless you find some complicated way of passing data around,
it won't know whether verify_backup_file() emitted a warning or not --
it can redo the stat() and see what happens, but it's not absolutely
guaranteed to be the same as what happened before. Making sure that
you always emit any given complaint once rather than twice or zero
times is going to be tricky.

It seems more natural to me to just accept the (small) cost of a
strcmp() in verify_backup_file(). If the initial stat() fails, it
emits whatever complaint is appropriate and returns and the logic to
check the system identifier is never reached. If it succeeds, you can
proceed to try to open the file and do what you need to do.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2024-02-01 18:32:59 Re: pgbench - adding pl/pgsql versions of tests
Previous Message vignesh C 2024-02-01 18:31:11 Re: Where can I find the doxyfile?