Re: pgsql: Add TAP tests for pg_verify_checksums

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Magnus Hagander <magnus(at)hagander(dot)net>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, Postgres hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pgsql: Add TAP tests for pg_verify_checksums
Date: 2018-10-20 04:57:23
Message-ID: CAOuzzgqH=J2QPxhQKAxh+Fbn6Uw-_x3rPkJbcFFbUh50qm0v8A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

Greetings,

On Sat, Oct 20, 2018 at 00:43 Michael Paquier <michael(at)paquier(dot)xyz> wrote:

> On Sat, Oct 20, 2018 at 12:25:19AM -0400, Stephen Frost wrote:
> > On Fri, Oct 19, 2018 at 23:40 Michael Paquier <michael(at)paquier(dot)xyz>
> wrote:
> >> - I agree with not doing a simple revert to not turn the buildfarm red
> >> again. This is annoying for animal maintainers. Andrew has done a very
> >> nice work in disabling manually those tests temporarily.
> >
> > This is a red herring, and always was, so I’m rather unimpressed at how
> it
> > keeps coming up- no, I’m not advocating that we should just make the
> build
> > farm red and just leave it that way. Yes, we should fix this case, and
> fix
> > pg_basebackup, and maybe even try to add some regression tests which test
> > this exact same case in pg_basebackup, but making the build farm green is
> > *not* the only thing we should care about.
>
> Well, the root of the problem was that pg_verify_checksums has been
> committed without any tests on its own. If we had those tests from the
> start, then we would not be having this discussion post-release time,
> still trying to figure out if whitelisting or blacklisting is
> appropriate.
>
> The validation checksums in base backups has been added in 4eb77d50, a
> couple of days before pg_verify_checksums has been introduced. This has
> added corruption-related tests in src/bin/pg_basebackup, which is a good
> thing. However the feature has been designed so as checksum mismatches
> are ignored after 5 failures, which actually *masked* the fact that
> EXEC_BACKEND files like CONFIG_EXEC_PARAMS should have been skipped
> instead of getting checksum failures. And that's a bad thing. So this
> gives in my opinion a good point about using a whitelist.

That counter of checksum failures should have been getting reset between
files.. that’s certainly what I had understood was intended. The idea of
the counter is to not flood the log with errors when a file is discovered
that’s full of checksum failures (as can happen if large chunks of the file
got replaced with something else, for example), but it should be reporting
on each file that does have failures in it.

I don’t see how having a whitelist changes that or would have impacted that
logic to make it correct initially or not.

I’m also trying to understand how this checksum logging limit is getting
hit in the tests but they’re passing..? If this was an intended failure
check then surely there’s a test done that’s intended to be successful and
it should have complained about this file too, or perhaps that’s what’s
missing. I’m happy to look into this later this weekend. Certainly seems
like something here isn’t really making sense tho.

Thanks!

Stephen

In response to

Browse pgsql-committers by date

  From Date Subject
Next Message Michael Paquier 2018-10-20 04:58:23 Re: pgsql: Add TAP tests for pg_verify_checksums
Previous Message Michael Paquier 2018-10-20 04:51:50 Re: pgsql: Add TAP tests for pg_verify_checksums

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-10-20 04:58:23 Re: pgsql: Add TAP tests for pg_verify_checksums
Previous Message Michael Paquier 2018-10-20 04:51:50 Re: pgsql: Add TAP tests for pg_verify_checksums