Re: pgsql: Add TAP tests for pg_verify_checksums

From: Andres Freund <andres(at)anarazel(dot)de>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, Postgres hackers <pgsql-hackers(at)postgresql(dot)org>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: pgsql: Add TAP tests for pg_verify_checksums
Date: 2018-10-19 18:30:50
Message-ID: 20181019183050.6fgqc5ugnavfzae4@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

On 2018-10-19 14:11:03 -0400, Stephen Frost wrote:
> Greetings,
>
> * Andres Freund (andres(at)anarazel(dot)de) wrote:
> > On 2018-10-19 13:52:28 -0400, Stephen Frost wrote:
> > > > > I also categorically disagree with the notion that it's ok for
> > > > > extensions to dump files into our tablespace diretories or that we
> > > > > should try to work around random code dumping extra files in the
> > > > > directories which we maintain- it's not like this additional code will
> > > > > actually protect us from that, after all, and it's foolish to think we
> > > > > really could protect against that.
> > > >
> > > > I'm unconvinced. There already are extensions doing so, and it's not
> > > > like we've given them any sort of reasonable alternative. You can't just
> > > > create relations in a "registered" / "supported" kinda way right now, so
> > > > uh, yea, extension gotta make do. And that has worked for many years.
> > >
> > > I don't agree that any of this is an argument for accepting random code
> > > writing into arbitrary places in the data directory or tablespace
> > > directories as being something which we support or which we write code
> > > to work-around.
> > >
> > > If this is a capability that extension authors need then I'm all for
> > > having a way for them to have that capability in a clearly defined and
> > > documented way- and one which we could actually write code to handle and
> > > work with.
> >
> > cstore e.g. does this, and it's already out there. So yes, we should
> > provide better infrastructure. But for now, we gotta deal with reality,
> > unless we just want to gratuituously break things.
>
> No, I don't agree that we are beholden to an external extension that
> decided to start writing into directories that clearly belong to PG.

Did it have an alternative?

> How do we even know that what cstore does in the tablespace directory
> wouldn't get caught up in the checksum file pattern-matching that this
> commit put in?

You listen to people?

> What if there was an extension which did create files that would
> match, what would we do then? I'm happy to go create one, if that'd
> help make the point that this "pattern whitelist" isn't actually a
> solution but is really rather just a hack that'll break in the future.

Oh, for crying out loud. Yes, obviously you can create conflicts, nobody
ever doubted that. How on earth is that a useful point for anything? The
point isn't that it's a good idea for extensions to do so, but that it
has happened because we didn't provide better alternative.

> > > > And I fail to see why a blacklist is architecturally better. There's
> > > > plenty cases where we might want to create temporary files,
> > > > non-checksummed data or such that we'd need a teach a blacklist about,
> > > > but there's far fewer cases where add new checksummed files. Basically
> > > > never since checksums have been introduced. And if checksums were
> > > > introduced for something new, say slrus, we'd ceertainly use
> > > > pg_verify_checksum during development.
> > >
> > > In cases where we need something temporary, we're going to need to be
> > > prepared to clean those up and we had better make it very plain that
> > > they're temporary and easy to write code for. Anything we aren't
> > > prepared to blow away on a crash-restart should be checksum'd and in an
> > > ideal world, we'd be looking to reduce the blacklist to only those
> > > things which are temporary.
> >
> > There's pending patches that add support for pg_verify_checksums running
> > against a running cluster. We'll not just need to recognize files that
> > are there after a graceful shutdown, but with anything that a cluster
> > can have there while running.
>
> Of course- the same is true with a crash/restart case, so I'm not sure
> what you're getting at here.

pg_verify_checksum doesn't support running on a crashed cluster, and I'm
not sure it'd make sense to teach it to so (there's not really much it
could do to discern whether a block is a torn page that replay will fix,
or corrupted).

> I wasn't ever thinking of only the graceful shutdown case- certainly
> pg_basebackup operates when the cluster is running also and that's
> more what I was thinking about from the start and which is part of
> what started the discussion about this commit as that was entirely
> ignored in this change.

It was mainly ignored in the original pg_verify_checksums change, not in
the commit discussed here. That was broken. That built separate logic.

Both basebackup an verify checksums already only scan certain
directories. They *already* are encoding directory structure
information.

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Stephen Frost 2018-10-19 18:47:51 Re: pgsql: Add TAP tests for pg_verify_checksums
Previous Message Andres Freund 2018-10-19 18:21:27 Re: pgsql: Add TAP tests for pg_verify_checksums

Browse pgsql-hackers by date

  From Date Subject
Next Message Aleksandr Parfenov 2018-10-19 18:31:04 Re: Optimze usage of immutable functions as relation
Previous Message Andres Freund 2018-10-19 18:21:27 Re: pgsql: Add TAP tests for pg_verify_checksums