Re: pg_amcheck contrib application

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>, Amul Sul <sulamul(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_amcheck contrib application
Date: 2021-03-04 21:23:42
Message-ID: CA+TgmoZWoXmNjgJn_vB98U_hw+1NVEz1xUZF0Te54i+HiTuxSw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Mar 4, 2021 at 12:27 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> More in a bit, need to grab some lunch.

Moving on to the tests, in 003_check.pl, I think it would be slightly
better if relation_toast were to select ct.oid::regclass and then just
have the caller use that value directly. We'd certainly want to do
that if the name could contain any characters that might require
quoting. Here that's not possible, but I think we might as well use
the same technique anyway.

I'm not sure how far to go with it, but I think that you might want to
try to enhance the logging in some of the cases where the TAP tests
might fail. In particular, if either of these trip in the buildfarm,
it doesn't seem like it will be too easy to figure out why they
failed:

+ fail('Xid thresholds not as expected');
+ fail('Page layout differs from our expectations');

You might want to rephrase the message to incorporate the values that
triggered the failure, e.g. "datfrozenxid $datfrozenxid is not between
3 and $relfrozenxid", "expected (a,b) = (12345678,abcdefg) but got
($x,$y)", so that if the buildfarm happens to fail there's a shred of
hope that we might be able to guess the reason from the message. You
could also give some thought to whether there are any tests that can
be improved in similar ways. Test::More is nice in that when you run a
test with eq() or like() and it fails it will tell you about the input
values in the diagnostic, but if you do something like is($x < 4, ...)
instead of cmp_ok($x, '<', 4, ...) then you lose that. I'm not saying
you're doing that exact thing, just saying that looking through the
test code with an eye to finding things where you could output a
little more info about a potential failure might be a worthwhile
activity.

If it were me, I would get rid of ROWCOUNT and have a list of
closures, and then loop over the list and call each one e.g. my
@corruption = ( sub { ... }, sub { ... }, sub { ... }) or maybe
something like what I did with @scenario in
src/bin/pg_verifybackup/t/003_corruption.pl, but this is ultimately a
style preference and I think the way you actually did it is also
reasonable, and some people might find it more readable than the other
way.

The name int4_fickle_ops is positively delightful and I love having a
test case like this.

On the whole, I think these tests look quite solid. I am a little
concerned, as you may gather from the comment above, that they will
not survive contact with the buildfarm, because they will turn out to
be platform or OS-dependent in some way. However, I can see that
you've taken steps to avoid such dependencies, and maybe we'll be
lucky and those will work. Also, while I am suspicious something's
going to break, I don't know what it's going to be, so I can't suggest
any method to avoid it. I think we'll just have to keep an eye on the
buildfarm post-commit and see what crops up.

Turning to the documentation, I see that it is documented that a bare
command-line argument can be a connection string rather than a
database name. That sounds like a good plan, but when I try
'pg_amcheck sslmode=require' it does not work: FATAL: database
"sslmode=require" does not exist. The argument to -e is also
documented to be a connection string, but that also seems not to work.
Some thought might need to be given to what exactly these connection
opens are supposed to mean. Like, do the connection options I set via
-e apply to all the connections I make, or just the one to the
maintenance database? How do I set connection options for connections
to databases whose names aren't specified explicitly but are
discovered by querying pg_database? Maybe instead of allowing these to
be a connection string, we should have a separate option that can be
used just for the purpose of setting connection options that then
apply to all connections. That seems a little bit oddly unlike other
tools, but if I want sslmode=verify-ca or something on all my
connections, there should be an easy way to get it.

The documentation makes many references to patterns, but does not
explain what a pattern is. I see that psql's documentation contains an
explanation, and pg_dump's documentation links to psql's
documentation. pg_amcheck should probably link to psql's
documentation, too.

In the documentation for -d, you say that "If -a --all is also
specified, -d --database does not additionally affect which databases
are checked." I suggest replacing "does not additionally affect which
databases are checked" with "has no effect."

In two places you say "without regard for" but I think it should be
"without regard to".

In the documentation for --no-strict-names you use "nor" where I think
it should say "or".

I kind of wonder whether we need --quiet. It seems like right now it
only does two things. One is to control complaints about ignoring the
startblock and endblock options, but I don't agree with that behavior
anyway. The other is control whether we complain about unmatched
patterns, but I think that could just be controlled --no-strict-names
i.e. normally an unmatched pattern results in a complaint and a
failure, but with --no-strict-names there is neither a complaint nor a
failure. Having a flag to control whether we get the message
separately from whether we get the failure doesn't seem helpful.

I don't think it's good to say "This is an alias for" in the
documentation of -i -I -t -T. I suggest instead saying "This is
similar to".

Instead of "Option BLAH takes precedence over..." I suggest "The BLAH
option takes precedence over..."

OK, that's it from me for this review pass.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-03-04 21:28:49 Re: [PATCH] postgres-fdw: column option to override foreign types
Previous Message Andrew Dunstan 2021-03-04 21:17:14 Re: Removing support for COPY FROM STDIN in protocol version 2