Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "Masahiko Sawada" <sawada(dot)mshk(at)gmail(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: [Proposal] Allow users to specify multiple tables in VACUUM commands
Date: 2017-09-20 23:18:08
Message-ID: 7232.1505949488@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

"Bossart, Nathan" <bossartn(at)amazon(dot)com> writes:
> On 9/20/17, 2:29 PM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I think that the way this ought to work is you process the VacuumRelation
>> structs purely sequentially, each in its own transaction, so you don't
>> need leaps of faith about what to do if a relation changes between the
>> first time you look at it and when you actually come to process it.

> How might this work for VACUUM statements with no tables specified? It
> seems like we must either handle the case when the relation changes before
> it is processed or hold a lock for the duration of the vacuuming.

I don't see a need to change the way that vacuum-with-no-table operates.
It essentially collects a list of relation OIDs that exist at the start
of vacuuming, and then vacuums all the ones that still exist when their
turn comes. Since no information beyond the bare OID is saved across the
preceding transactions, there's not really any schema-change risk involved.

We could possibly adapt that concept to the inheritance/partitioning cases
for vacuum with a table name or list of names: when we first come to a
VacuumRelation, collect a list of child table OIDs, and then process each
one unless it's disappeared by the time its turn comes. But in any case,
we should not be doing any checks on a particular relation until we've got
it open and locked with intent to vacuum.

>> The idea of "fast failing" because some later VacuumRelation struct might
>> contain an error seems like useless complication, both in terms of the
>> implementation and the user-visible behavior.

> I agree that the patch might be simpler without this, but the user-visible
> behavior is the reason I had included it. In short, my goal was to avoid
> errors halfway through a long-running VACUUM statement because the user
> misspelled a relation/column name or the relation/column was dropped.

I don't particularly buy that argument, because it's not the case that
the preceding processing was wasted when that happens. We've done and
committed the vacuuming work for the earlier relations.

> It's true that the tests become mostly pointless if the relations or
> columns change before they are processed, but this seems like it would be
> a rarer issue in typical use cases.

Nonetheless, we'd have to explain this behavior to people, and I think
it's mostly useless complication. With what I'm proposing, if vacuum
complains about the third table in the list, you know it has done the
ones before that. What what you want to do, maybe it did the ones
before that, or maybe it didn't.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-09-20 23:20:40 Re: compress method for spgist - 2
Previous Message Peter Geoghegan 2017-09-20 23:08:27 Re: CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?