Re: A few new options for vacuumdb

From: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: A few new options for vacuumdb
Date: 2019-01-22 23:21:32
Message-ID: F716EB06-0CBE-4342-9CFA-23409319B52A@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 1/21/19, 10:08 PM, "Michael Paquier" <michael(at)paquier(dot)xyz> wrote:
> On Mon, Jan 21, 2019 at 10:09:09PM +0000, Bossart, Nathan wrote:
>> Here's a new patch set that should address the feedback in this
>> thread. The changes in this version include:
>>
>> - 0001 is a small fix to the 'vacuumdb --disable-page-skipping'
>> documentation. My suggestion is to keep it short and simple like
>> --full, --freeze, --skip-locked, --verbose, and --analyze. The
>> DISABLE_PAGE_SKIPPING option is well-described in the VACUUM
>> documentation, and IMO it is reasonably obvious that such vacuumdb
>> options correspond to the VACUUM options.
>
> Okay, committed this one.

Thanks.

> Regarding the search_path business, there is actually similar business
> in expand_table_name_patterns() for pg_dump if you look close by, so
> as users calling --table don't have to specify the schema, and just
> stick with patterns.

I see. The "WHERE c.relkind..." clause looks a bit different than
what I have, so I've altered vacuumdb's catalog query to match. This
was changed in expand_table_name_patterns() as part of 582edc369cd for
a security fix, so we should probably do something similar here.

> + /*
> + * Prepare the list of tables to process by querying the catalogs.
> + *
> + * Since we execute the constructed query with the default search_path
> + * (which could be unsafe), it is important to fully qualify everything.
> + */
> It is not only important, but also absolutely mandatory, so let's make
> the comment insisting harder on that point. From this point of view,
> the query that you are building is visibly correct.

I've updated the comment as suggested.

> + appendStringLiteralConn(&catalog_query, just_table, conn);
> + appendPQExpBuffer(&catalog_query, "::pg_catalog.regclass\n");
> +
> + pg_free(just_table);
> +
> + cell = cell->next;
> + if (cell == NULL)
> + appendPQExpBuffer(&catalog_query, " )\n");
> Hm. It seems to me that you are duplicating what
> processSQLNamePattern() does, so we ought to use it. And it is
> possible to control the generation of the different WHERE clauses with
> a single query based on the number of elements in the list. Perhaps I
> am missing something? It looks unnecessary to export
> split_table_columns_spec() as well.

I'm sorry, I don't quite follow this. AFAICT processSQLNamePattern()
is for generating WHERE clauses based on a wildcard-pattern string for
psql and pg_dump. This portion of the patch is generating a WHERE
clause to filter only for the tables listed via --table, and I don't
think the --table option currently accepts patterns. Since you can
also provide a list of columns with the --table option, I am using
split_table_columns_spec() to retrieve only the table name for the OID
lookup.

> - qr/statement: ANALYZE;/,
> + qr/statement: ANALYZE pg_catalog\./,
> Or we could just use "ANALYZE \.;/", perhaps patching it first.
> Perhaps my regexp here is incorrect, but I just mean to check for an
> ANALYZE query which begins by "ANALYZE " and finishes with ";",
> without caring about what is in-between. This would make the tests
> more portable.

Sure, I've split this out into a separate patch.

> +$node->issues_sql_like(
> + [ 'vacuumdb', '--table=pg_class', '--min-mxid-age=123456789',
> 'postgres'],
> + qr/AND GREATEST\(pg_catalog.mxid_age\(c.relminmxid\),
> pg_catalog.mxid_age\(t.relminmxid\)\) OPERATOR\(pg_catalog.>=\) 123456789/,
> + 'vacuumdb --table --min-mxid-age');
> +$node->issues_sql_like(
> + [ 'vacuumdb', '--min-xid-age=987654321', 'postgres' ],
> + qr/AND GREATEST\(pg_catalog.age\(c.relfrozenxid\),
> pg_catalog.age\(t.relfrozenxid\)\) OPERATOR\(pg_catalog.>=\) 987654321/,
> + 'vacuumdb --min-xid-age');
> It may be better to use numbers which make sure that no relations are
> actually fetched, so as if the surrounding tests are messed up we
> never make them longer than necessary just to check the shape of the
> query.

I think it's already pretty unlikely that any matching relations would
be found, but I've updated these values to be within the range where
any matching database has already stopped accepting commands to
prevent wraparound.

Nathan

Attachment Content-Type Size
v4-0001-Adjust-vacuumdb-test-regex-in-preparation-for-alw.patch application/octet-stream 2.1 KB
v4-0002-Always-use-a-catalog-query-to-discover-tables-to-.patch application/octet-stream 8.6 KB
v4-0003-Add-min-xid-age-and-min-mxid-age-options-to-vacuu.patch application/octet-stream 6.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2019-01-22 23:46:07 Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Previous Message maayan mordehai 2019-01-22 23:03:02 postgres on a non-journaling filesystem