Re: new heapcheck contrib module

From: Amul Sul <sulamul(at)gmail(dot)com>
To: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: new heapcheck contrib module
Date: 2020-08-24 09:48:22
Message-ID: CAAJ_b94Wh_7kM9JuJk7wnNsYSNPdSzz35cOEaE_ZRnPcqvupCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Few comments for v14 version:

v14-0001:

verify_heapam.c: In function ‘verify_heapam’:
verify_heapam.c:339:14: warning: variable ‘ph’ set but not used
[-Wunused-but-set-variable]
PageHeader ph;
^
verify_heapam.c: In function ‘check_toast_tuple’:
verify_heapam.c:877:8: warning: variable ‘chunkdata’ set but not used
[-Wunused-but-set-variable]
char *chunkdata;

Got these compilation warnings

+++ b/contrib/amcheck/amcheck.h
@@ -0,0 +1,5 @@
+#include "postgres.h"
+
+Datum verify_heapam(PG_FUNCTION_ARGS);
+Datum bt_index_check(PG_FUNCTION_ARGS);
+Datum bt_index_parent_check(PG_FUNCTION_ARGS);

bt_index_* are needed?

#include "access/htup_details.h"
#include "access/xact.h"
#include "catalog/pg_type.h"
#include "catalog/storage_xlog.h"
#include "storage/smgr.h"
#include "utils/lsyscache.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
#include "utils/syscache.h"

These header file inclusion to verify_heapam.c. can be omitted. Some of those
might be implicitly got added by other header files or no longer need due to
recent changes.

+ * on_error_stop:
+ * Whether to stop at the end of the first page for which errors are
+ * detected. Note that multiple rows may be returned.
+ *
+ * check_toast:
+ * Whether to check each toasted attribute against the toast table to
+ * verify that it can be found there.
+ *
+ * skip:
+ * What kinds of pages in the heap relation should be skipped. Valid
+ * options are "all-visible", "all-frozen", and "none".

I think it would be good if the description also includes what will be default
value otherwise.

+ /*
+ * Optionally open the toast relation, if any, also protected from
+ * concurrent vacuums.
+ */

Now lock is changed to AccessShareLock, I think we need to rephrase this comment
as well since we are not really doing anything extra explicitly to protect from
the concurrent vacuum.

+/*
+ * Return wehter a multitransaction ID is in the cached valid range.
+ */

Typo: s/wehter/whether

v14-0002:

+#define NOPAGER 0

Unused macro.

+ appendPQExpBuffer(querybuf,
+ "SELECT c.relname, v.blkno, v.offnum, v.attnum, v.msg"
+ "\nFROM public.verify_heapam("
+ "\nrelation := %u,"
+ "\non_error_stop := %s,"
+ "\nskip := %s,"
+ "\ncheck_toast := %s,"
+ "\nstartblock := %s,"
+ "\nendblock := %s) v, "
+ "\npg_catalog.pg_class c"
+ "\nWHERE c.oid = %u",
+ tbloid, stop, skip, toast, startblock, endblock, tbloid);
[....]
+ appendPQExpBuffer(querybuf,
+ "SELECT public.bt_index_parent_check('%s'::regclass, %s, %s)",
+ idxoid,
+ settings.heapallindexed ? "true" : "false",
+ settings.rootdescend ? "true" : "false");

The assumption that the amcheck extension will be always installed in the public
schema doesn't seem to be correct. This will not work if amcheck install
somewhere else.

Regards,
Amul

On Thu, Aug 20, 2020 at 5:17 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:
>
> On Thu, Aug 20, 2020 at 8:00 AM Mark Dilger
> <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
> >
> >
> >
> > > On Aug 16, 2020, at 9:37 PM, Amul Sul <sulamul(at)gmail(dot)com> wrote:
> > >
> > > In addition to this, I found a few more things while reading v13 patch are as
> > > below:
> > >
> > > Patch v13-0001:
> > >
> > > -
> > > +#include "amcheck.h"
> > >
> > > Not in correct order.
> >
> > Fixed.
> >
> > > +typedef struct BtreeCheckContext
> > > +{
> > > + TupleDesc tupdesc;
> > > + Tuplestorestate *tupstore;
> > > + bool is_corrupt;
> > > + bool on_error_stop;
> > > +} BtreeCheckContext;
> > >
> > > Unnecessary spaces/tabs between } and BtreeCheckContext.
> >
> > This refers to a change in verify_nbtree.c that has been removed. Per discussions with Peter and Robert, I have simply withdrawn that portion of the patch.
> >
> > > static void bt_index_check_internal(Oid indrelid, bool parentcheck,
> > > - bool heapallindexed, bool rootdescend);
> > > + bool heapallindexed, bool rootdescend,
> > > + BtreeCheckContext * ctx);
> > >
> > > Unnecessary space between * and ctx. The same changes needed for other places as
> > > well.
> >
> > Same as above. The changes to verify_nbtree.c have been withdrawn.
> >
> > > ---
> > >
> > > Patch v13-0002:
> > >
> > > +-- partitioned tables (the parent ones) don't have visibility maps
> > > +create table test_partitioned (a int, b text default repeat('x', 5000))
> > > + partition by list (a);
> > > +-- these should all fail
> > > +select * from verify_heapam('test_partitioned',
> > > + on_error_stop := false,
> > > + skip := NULL,
> > > + startblock := NULL,
> > > + endblock := NULL);
> > > +ERROR: "test_partitioned" is not a table, materialized view, or TOAST table
> > > +create table test_partition partition of test_partitioned for values in (1);
> > > +create index test_index on test_partition (a);
> > >
> > > Can't we make it work? If the input is partitioned, I think we could
> > > collect all its leaf partitions and process them one by one. Thoughts?
> >
> > I was following the example from pg_visibility. I haven't thought about your proposal enough to have much opinion as yet, except that if we do this for pg_amcheck we should do likewise to pg_visibility, for consistency of the user interface.
> >
>
> pg_visibility does exist from before the declarative partitioning came
> in, I think it's time to improve that as well.
>
> > > + ctx->chunkno++;
> > >
> > > Instead of incrementing in check_toast_tuple(), I think incrementing should
> > > happen at the caller -- just after check_toast_tuple() call.
> >
> > I agree.
> >
> > > ---
> > >
> > > Patch v13-0003:
> > >
> > > + resetPQExpBuffer(query);
> > > + destroyPQExpBuffer(query);
> > >
> > > resetPQExpBuffer() will be unnecessary if the next call is destroyPQExpBuffer().
> >
> > Thanks. I removed it in cases where destroyPQExpBuffer is obviously the very next call.
> >
> > > + appendPQExpBuffer(query,
> > > + "SELECT c.relname, v.blkno, v.offnum, v.lp_off, "
> > > + "v.lp_flags, v.lp_len, v.attnum, v.chunk, v.msg"
> > > + "\nFROM verify_heapam(rel := %u, on_error_stop := %s, "
> > > + "skip := %s, startblock := %s, endblock := %s) v, "
> > > + "pg_class c"
> > > + "\nWHERE c.oid = %u",
> > > + tbloid, stop, skip, settings.startblock,
> > > + settings.endblock, tbloid
> > >
> > > pg_class should be schema-qualified like elsewhere.
> >
> > Agreed, and changed.
> >
> > > IIUC, pg_class is meant to
> > > get relname only, instead, we could use '%u'::pg_catalog.regclass in the target
> > > list for the relname. Thoughts?
> >
> > get_table_check_list() creates the list of all tables to be checked, which check_tables() then iterates over, calling check_table() for each one. I think some verification that the table still exists is in order. Using '%u'::pg_catalog.regclass for a table that has since been dropped would pass in the old table Oid and draw an error of the 'ERROR: could not open relation with OID 36311' variety, whereas the current coding will just skip the dropped table.
> >
> > > Also I think we should skip '\n' from the query string (see appendPQExpBuffer()
> > > in pg_dump.c)
> >
> > I'm not sure I understand. pg_dump.c uses "\n" in query strings it passes to appendPQExpBuffer(), in a manner very similar to what this patch does.
> >
>
> I see there is a mix of styles, I was referring to dumpDatabase() from pg_dump.c
> which doesn't include '\n'.
>
> > > + appendPQExpBuffer(query,
> > > + "SELECT i.indexrelid"
> > > + "\nFROM pg_catalog.pg_index i, pg_catalog.pg_class c"
> > > + "\nWHERE i.indexrelid = c.oid"
> > > + "\n AND c.relam = %u"
> > > + "\n AND i.indrelid = %u",
> > > + BTREE_AM_OID, tbloid);
> > > +
> > > + ExecuteSqlStatement("RESET search_path");
> > > + res = ExecuteSqlQuery(query->data, PGRES_TUPLES_OK);
> > > + PQclear(ExecuteSqlQueryForSingleRow(ALWAYS_SECURE_SEARCH_PATH_SQL));
> > >
> > > I don't think we need the search_path query. The main query doesn't have any
> > > dependencies on it. Same is in check_indexes(), check_index (),
> > > expand_table_name_patterns() & get_table_check_list().
> > > Correct me if I am missing something.
> >
> > Right.
> >
> > > + output = PageOutput(lines + 2, NULL);
> > > + for (lineno = 0; usage_text[lineno]; lineno++)
> > > + fprintf(output, "%s\n", usage_text[lineno]);
> > > + fprintf(output, "Report bugs to <%s>.\n", PACKAGE_BUGREPORT);
> > > + fprintf(output, "%s home page: <%s>\n", PACKAGE_NAME, PACKAGE_URL);
> > >
> > > I am not sure why we want PageOutput() if the second argument is always going to
> > > be NULL? Can't we directly use printf() instead of PageOutput() + fprintf() ?
> > > e.g. usage() function in pg_basebackup.c.
> >
> > Done.
> >
> >
> > Please find attached the next version of the patch. In addition to your review comments (above), I have made changes in response to Peter and Robert's review comments upthread.
>
> Thanks for the updated version, I'll have a look.
>
> Regards,
> Amul

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2020-08-24 10:36:27 Avoid displaying unnecessary "Recheck Cond" in EXPLAIN ANALYZE output if the bitmap is non-lossy
Previous Message Greg Nancarrow 2020-08-24 09:31:12 Issue with past commit: Allow fractional input values for integer GUCs ...