Re: new heapcheck contrib module

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Geoghegan <pg(at)bowt(dot)ie>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>, Amul Sul <sulamul(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: new heapcheck contrib module
Date: 2021-03-02 20:39:29
Message-ID: CA+TgmoZfPZw+-gJERHojKPTSMKVhuEFJeRhw0Yg7w_sT2Nqm6w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 2, 2021 at 1:24 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Other than that 0001 looks to me to be in pretty good shape now.

Incidentally, we might want to move this to a new thread with a better
subject line, since the current subject line really doesn't describe
the uncommitted portion of the work. And create a new CF entry, too.

Moving onto 0002:

The index checking options should really be called btree index
checking options. I think I'd put the table options first, and the
btree options second. Other kinds of indexes could follow some day. I
would personally omit the short forms of --heapallindexed and
--parent-check; I think we'll run out of option names too quickly if
people add more kinds of checks.

Perhaps VerifyBtreeSlotHandler should emit a warning of some kind if
PQntuples(res) != 0.

+ /*
+ * Test that this function works, but for now we're
not using the list
+ * 'relations' that it builds.
+ */
+ conn = connectDatabase(&cparams, progname, opts.echo,
false, true);

This comment appears to have nothing to do with the code, since
connectDatabase() does not build a list of 'relations'.

amcheck_sql seems to include paranoia, but do we need that if we're
using a secure search path? Similarly for other SQL queries, e.g. in
prepare_table_command.

It might not be strictly necessary for the static functions in
pg_amcheck.c to use_three completelyDifferent NamingConventions for
its static functions.

should_processing_continue() is one semicolon over budget.

The initializer for opts puts a comma even after the last member
initializer. Is that going to be portable to all compilers?

+ for (failed = false, cell = opts.include.head; cell; cell = cell->next)

I think failed has to be false here, because it gets initialized at
the top of the function. If we need to reinitialize it for some
reason, I would prefer you do that on the previous line, separate from
the for loop stuff.

+ char *dbrgx; /* Database regexp parsed from pattern, or
+ * NULL */
+ char *nsprgx; /* Schema regexp parsed from pattern, or NULL */
+ char *relrgx; /* Relation regexp parsed from pattern, or
+ * NULL */
+ bool tblonly; /* true if relrgx should only match tables */
+ bool idxonly; /* true if relrgx should only match indexes */

Maybe: db_regex, nsp_regex, rel_regex, table_only, index_only?

Just because it seems theoretically possible that someone will see
nsprgx and not immediately understand what it's supposed to mean, even
if they know that nsp is a common abbreviation for namespace in
PostgreSQL code, and even if they also know what a regular expression
is.

Your four messages about there being nothing to check seem like they
could be consolidated down to one: "nothing to check for pattern
\"%s\"".

I would favor changing things so that once argument parsing is
complete, we switch to reporting all errors that way. So in other
words here, and everything that follows:

+ fprintf(stderr, "%s: no databases to check\n", progname);

+ * ParallelSlots based event loop follows.

"Main event loop."

To me it would read slightly better to change each reference to
"relations list" to "list of relations", but perhaps that is too
nitpicky.

I think the two instances of goto finish could be avoided with not
much work. At most a few things need to happen only if !failed, and
maybe not even that, if you just said "break;" instead.

+ * Note: Heap relation corruption is returned by verify_heapam() without the
+ * use of raising errors, but running verify_heapam() on a corrupted table may

How about "Heap relation corruption() is reported by verify_heapam()
via the result set, rather than an ERROR, ..."

It seems mighty inefficient to have a whole bunch of consecutive calls
to remove_relation_file() or corrupt_first_page() when every such call
stops and restarts the database. I would guess these tests will run
noticeably faster if you don't do that. Either the functions need to
take a list of arguments, or the stop/start needs to be pulled up and
done in the caller.

corrupt_first_page() could use a comment explaining what exactly we're
overwriting, and in particular noting that we don't want to just
clobber the LSN, but rather something where we can detect a wrong
value.

There's a long list of calls to command_checks_all() in 003_check.pl
that don't actually check anything but that the command failed, but
run it with a bunch of different options. I don't understand the value
of that, and suggest reducing the number of cases tested. If you want,
you can have tests elsewhere that focus -- perhaps by using verbose
mode -- on checking that the right tables are being checked.

This is not yet a full review of everything in this patch -- I haven't
sorted through all of the tests yet, or all of the new query
construction logic -- but to me this looks pretty close to
committable.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2021-03-02 20:40:17 Re: [PATCH] Support empty ranges with bounds information
Previous Message Joel Jacobson 2021-03-02 20:26:50 Re: [PATCH] Support empty ranges with bounds information