Re: More issues with pg_verify_checksums and checksum verification in base backups

From: David Steele <david(at)pgmasters(dot)net>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, michael(at)paquier(dot)xyz
Cc: pgsql-hackers(at)postgresql(dot)org, andres(at)anarazel(dot)de, andrew(at)dunslane(dot)net, daniel(at)yesql(dot)se, magnus(at)hagander(dot)net, tgl(at)sss(dot)pgh(dot)pa(dot)us
Subject: Re: More issues with pg_verify_checksums and checksum verification in base backups
Date: 2018-10-30 18:32:43
Message-ID: 855d2bde-900c-4b3a-689c-e18c76a73d32@pgmasters.net
Lists: pgsql-hackers

On 10/30/18 11:59 AM, Stephen Frost wrote:
>
> * Kyotaro HORIGUCHI (horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp) wrote:
>>
>> So I'm +1 for the Michael's current patch as (I think) we can't
>> make visible or large changes.
>>
>> That said, I agree with Stephen's concern on the point we could
>> omit required files in future, but on the other hand I don't want
>> random files to be simply rejected.
>
> They aren't rejected- there's a warning thrown about them.

pgBackRest has been using a whitelist/blacklist method for identifying
checksummable files for almost 2 years and we haven't seen any issues. The
few times a "random" file appeared in the logs with checksum warnings it
was later identified as having been mistakenly copied into $PGDATA. The
backup still completed successfully in these cases.

So to be clear, we whitelist the global, base, and pg_tblspc dirs and
blacklist PG_VERSION, pg_filenode.map, pg_internal.init, and pg_control
(just for global) when deciding which files to checksum. Recently we
added logic to exclude unlogged and temporary relations as well, though
that's not required.
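
To make the rule above concrete, here is a minimal sketch in Python. It is
not pgBackRest's actual code: the function name is_checksummable and the
constant names are made up for illustration, paths are assumed to be
relative to $PGDATA, and the optional exclusion of unlogged and temporary
relations is left out.

```python
from pathlib import PurePosixPath

# Top-level directories whose files are candidates for checksum verification.
WHITELISTED_DIRS = {"global", "base", "pg_tblspc"}

# File names skipped wherever they appear under a whitelisted directory.
BLACKLISTED_NAMES = {"PG_VERSION", "pg_filenode.map", "pg_internal.init"}

# File names skipped only when they sit under global/.
GLOBAL_ONLY_BLACKLIST = {"pg_control"}


def is_checksummable(rel_path: str) -> bool:
    """Return True if a file (path relative to $PGDATA) should be checksummed.

    Hypothetical helper: illustrates the whitelist/blacklist rule only.
    """
    parts = PurePosixPath(rel_path).parts
    # Must live inside one of the whitelisted top-level directories.
    if len(parts) < 2 or parts[0] not in WHITELISTED_DIRS:
        return False
    name = parts[-1]
    if name in BLACKLISTED_NAMES:
        return False
    if parts[0] == "global" and name in GLOBAL_ONLY_BLACKLIST:
        return False
    return True
```

With this rule a stray file copied into base/ is still treated as
checksummable, so it produces a checksum warning instead of being silently
skipped, which matches the behavior described above.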

For PG11 I would recommend just adding the param file generated by exec
backend to the blacklist for both pg_basebackup and pg_verify_checksums,
then creating a common facility for blacklisting in PG12.

I'm not very excited about the idea of encouraging extensions to drop
files in the postgres relation directories (base, global, pg_tblspc).
If we don't say we support it then in my mind that means we don't.
There are lots of ways extension authors could make naming mistakes that
would lead to their files being cleaned up by Postgres at startup or
included in a DROP DATABASE.

I am OK with allowing an extension directory for each tablespace/db dir
where extensions can safely drop files for PG12, if we decide that's
something worth doing.

Regards,
--
-David
david(at)pgmasters(dot)net
