Re: More issues with pg_verify_checksums and checksum verification in base backups

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: michael(at)paquier(dot)xyz, pgsql-hackers(at)postgresql(dot)org, andres(at)anarazel(dot)de, andrew(at)dunslane(dot)net, daniel(at)yesql(dot)se, magnus(at)hagander(dot)net, tgl(at)sss(dot)pgh(dot)pa(dot)us
Subject: Re: More issues with pg_verify_checksums and checksum verification in base backups
Date: 2018-10-30 15:59:21
Message-ID: 20181030155921.GS4184@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greetings,

* Kyotaro HORIGUCHI (horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp) wrote:
> At Wed, 24 Oct 2018 14:31:37 +0900, Michael Paquier <michael(at)paquier(dot)xyz> wrote in <20181024053137(dot)GL1658(at)paquier(dot)xyz>
> > On Sun, Oct 21, 2018 at 08:56:32PM -0400, Stephen Frost wrote:
> > > All of this pie-in-the-sky about what pluggable storage might have is
> > > just hand-waving, in my opinion, and not worth much more than that. I
> > > hope (and suspect..) that the actual pluggable storage that's being
> > > worked on doesn't do any of this "just drop a file somewhere" because
> > > there's a lot of downsides to it- and if it did, it wouldn't be much
> > > more than what we can do with an FDW, so why go through and add it?
> >
> > Well, there is no point in enforcing a rule that something is forbidden
> > if if was never implied and never documented (the rule here is to be
> > able to drop custom files into global/, base/ or pg_tblspc.). Postgres
> > is highly-customizable, I would prefer if features in core are designed
> > so as we keep things extensible, the checksum verification for base
> > backup on the contrary restricts things.
> >
> > So, do we have other opinions about this thread?
>
> I believe of freedom for anyone of dropping any files into
> anywhere in $PGDATA if he thinks it sefe for the time
> being. Changes in core sometimes breaks some extensions and they
> used to be 'fixed' in the manner or claim for a replacement to
> core. This is the same for changes of (undocumented) APIs. I
> think things have been going around in this manner for years and
> I don't think core side is unable to define a definite border of
> what extensions are allowed to do since we are not knowledgeable
> of all extensions that will be created in future or that have
> created.
>
> So I'm +1 for the Michael's current patch as (I think) we can't
> make visible or large changes.
>
>
> That said, I agree with Stephen's concern on the point we could
> omit requried files in future, but on the other hand I don't want
> random files are simply rejected.

They aren't rejected- there's a warning thrown about them.

> So I propose to sort files into roughly three categories. One is
> files we know of but to skip. Second is files we know of and need
> checksum verification. The last is the files we just don't know of.

That last set really should be empty though.

> In the attached PoC (checksum_)find_file_type categorizes a file
> into 6 (or 7) categories.

I'm certainly not thrilled with the idea of adding a bunch more code to
v11 for this.

> ENTRY_TO_IGNORE: to be always ignored, for "." and "..".
> FILE_TO_SKIP: we know of and to skip. files in the black list.
> DIR_TO_SKIP: directories same to the above.
> DIR_TO_SCAN: we know any file to scan may be in it.
> HEAP_TO_SCAN: we know it has checksums in heap format.
> FILE_UNKNOWN: we don't know of.
> (STAT_FAILED: stat filed for the file)
>
> Among these types, what are related to the discussion above are
> FILE_TO_SKIP, HEAP_TO_SCAN and FILE_UNKNOWN. Others are for
> controlling search loop in pg_verify_checksums.
>
> Based on this categorization, pg_verify_checksum's result is shown as:
>
> > Checksum scan completed
> > Data checksum version: 1
> > Files scanned: 1095
> > Blocks scanned: 2976
> > Bad checksums: 0
> + Files skipped: 8
> + Unknown files skipped: 1
> + Skipped directories: 1

I'd rather those files be shown as a warning than just listed as
'Unknown'. Previously, throwing a warning is exactly what we did and
seems like the most sensible thing to do when we come across random
files which have shown up in the data directory that we don't recognize
or understand.

> (Data checksum version is bonded with heap checksums..)
>
> If this sort of change is acceptable for us, I believe it
> comforms everyone here's wishes. If skipped unknown is not zero
> on a buildfarm animal, it is a sign of something is forgotten.

Having a buildfarm checking would certainly be good, but I'm really not
sure that the added complexity here makes sense. If we're going to go
down this road, it also seems like it'd make sense to complain about
things that we don't recognize in other directories too.

> The second attached (also PoC) is common'ize the file sorter. The
> function is moved to src/common/file_checksums.c and both
> pg_verify_checksums.c and basebackup.c uses it. With
> log_min_messages=debug1, you will see the following message for
> unkown files during basebackup.

I'm definitely on-board with figuring out a way to provide more
inspection of the data directory through src/common functions that can
be leveraged by the frontend and backend tools, as well as other tools.

> > DEBUG: checksum verification was skipped for unknown file: ./base/hoge
>
> This changed one behavior of the tool. -r now can take
> 'dbid/relid' addition to just 'relid'. I think we could have
> .verify_checksum.exlucde file so that extensions can declare
> files.

Generally speaking, if we really want to support this, then we would
need to have some amount of documented "this is what extensions are
allowed to do, and what they aren't" and I'd probably say we would want
to have a directory for extensions to drop their files into, and then
have that directory be split up by extension name (and maybe version..?)
and then we could just skip the entire extension directory. Having
flags or parameters around for users to provide a list of files to
ignore because some extension created those files seems like a really
poor approach.

I didn't reallly look at the patches you sent much, just to be clear.

I did discuss this with some folks at PGConf EU and encouraged them to
comment as this thread too. Hopefully some will.

Thanks!

Stephen

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2018-10-30 16:02:10 Re: [HACKERS] generated columns
Previous Message Axel Rau 2018-10-30 15:51:00 Re: Getting fancy errors when accessing information_schema on 10.5