Re: [Proposal] Allow users to specify multiple tables in VACUUM commands

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: "Bossart, Nathan" <bossartn(at)amazon(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: [Proposal] Allow users to specify multiple tables in VACUUM commands
Date: 2017-08-30 22:37:09
Message-ID: CAB7nPqRQhSFP+ENJPd2xi6OX5aZ8u1Q-Y2ZW-P+4t=TFxrC+AQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 30, 2017 at 1:47 AM, Bossart, Nathan <bossartn(at)amazon(dot)com> wrote:
> On 8/28/17, 11:26 PM, "Michael Paquier" <michael(dot)paquier(at)gmail(dot)com> wrote:
>> About the de-duplication patch, I have to admit that I am still not a
>> fan of doing such a thing. Another road that we could take is to
>> simply complain with a proper error message if:
>> - the same column name is specified twice for a relation.
>> - the same relation is defined twice. In the case of partitions, we
>> could track the fact that it is already listed as part of a parent,
>> though perhaps it does not seem worth the extra CPU cost especially
>> when there are multiple nesting levels with partitions.
>> Autovacuum has also the advantage, if I recall correctly, to select
>> all columns for analyze, and skip parent partitions when scanning for
>> relations so that's a safe bet from this side. Opinions welcome.
>
> I lean towards favoring the de-duplication patch, but maybe I am biased
> as the author.

I can be biased as reviewer then.

> I can see the following advantages:
>
> 1. Ease of use. By taking care of de-duplicating on behalf of the user,
> they needn't worry about inheritance structures or accidentally
> specifying the same relation or column twice. This might be especially
> useful if a large number of relations or columns must be specified.
> 2. Resource conservation. By de-duplicating, VACUUM and ANALYZE are
> doing roughly the same thing but with less work.
> 3. The obnoxious errors you were experiencing are resolved. This seems
> like the strongest argument to me, as it fixes an existing issue.
>
> Disadvantages might include:
>
> 1. Users cannot schedule repeated VACUUMs on the same relation (e.g.
> 'VACUUM table, table, table;'). However, I cannot think of a time when
> I needed this, and it seems like something else is wrong with VACUUM if
> folks are resorting to this. In the end, you could still achieve this
> via several VACUUM statements.
> 2. Any inferred ordering for how the relations are processed will not
> be accurate if there are duplicates. Ultimately, users might lose some
> amount of control here, but I am not sure how prevalent this use case
> might be. In the worst case, you could achieve this via several
> individual VACUUM statements as well.
>
> Your suggestion to ERROR seems like a reasonable compromise, but I
> could see it causing frustration in some cases, especially with
> partitioning.

Yeah... Each approach has its cost and its advantages. It may be
better to wait for more opinions, no many people have complained yet
that for example a list of columns using twice the same one fails.

+VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ <replaceable
class="PARAMETER">table_name</replaceable> ] [, ...]
I just noticed that... But regarding the docs, I think that you have
misplaced the position of "[, ...]", which should be inside the
table_name portion in the case of what I quote here, no?

+VacuumRelation *
+makeVacuumRelation(RangeVar *relation, List *va_cols, Oid oid)
+{
+ VacuumRelation *vacrel = makeNode(VacuumRelation);
+ vacrel->relation = relation;
+ vacrel->va_cols = va_cols;
+ vacrel->oid = oid;
+ return vacrel;
+}
Perhaps in makefuncs.c instead of vacuum.c? That's usually the place
used for node constructions like that.

Those are minor tweaks, I'll be fine to move that as ready for
committer after for those points are addressed for the main patch.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bossart, Nathan 2017-08-30 23:08:14 Re: [Proposal] Allow users to specify multiple tables in VACUUM commands
Previous Message Michael Paquier 2017-08-30 22:16:59 Re: Re: [COMMITTERS] pgsql: pg_rewind: Fix some problems when copying files >2GB.