Re: parallel vacuum options/syntax

From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: parallel vacuum options/syntax
Date: 2020-01-02 13:39:20
Message-ID: CAECtzeUbsb0m2_qTXjwFU2qUDTN3V_2K9=ySU=HPDm1PtGf95g@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 2, 2020 at 13:09, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:

> Hi,
>
> I am starting a new thread for some of the decisions for a parallel vacuum
> in the hope to get feedback from more people. There are mainly two points
> for which we need some feedback.
>
> 1. Tomas Vondra has pointed out on the main thread [1] that by default the
> parallel vacuum should be enabled similar to what we do for Create Index.
> As proposed, the patch enables it only when the user specifies it (ex.
> Vacuum (Parallel 2) <tbl_name>;). One of the arguments in favor of
> enabling it by default as mentioned by Tomas is "It's pretty much the same
> thing we did with vacuum throttling - it's disabled for explicit vacuum by
> default, but you can enable it. If you're worried about VACUUM causing
> issues, you should set cost delay.". Some of the arguments against
> enabling it are that it will lead to use of more resources (like CPU, I/O)
> which users might or might not like.
>
> Now, if we want to enable it by default, we need a way to disable it as
> well and along with that, we need a way for users to specify a parallel
> degree. I have mentioned a few reasons why we need a parallel degree for
> this operation in the email [2] on the main thread.
>
> If parallel vacuum is **not** enabled by default, then I think the
> current way to enable is fine which is as follows:
> Vacuum (Parallel 2) <tbl_name>;
>
> Here, if the user doesn't specify parallel_degree, then we internally
> decide based on number of indexes that support a parallel vacuum with a
> maximum of max_parallel_maintenance_workers.
>
> If the parallel vacuum is enabled by default, then I could think of the
> following ways:
> (a) Vacuum (disable_parallel) <tbl_name>; Vacuum (Parallel
> <parallel_degree>) <tbl_name>;
> (b) Vacuum (Parallel <parallel_degree>) <tbl_name>; If user specifies
> parallel_degree as 0, then disable parallelism.
> (c) ... Any better ideas?
>
>
AFAICT, every parallel-able statement uses parallelisation by default, so it
wouldn't be consistent if VACUUM behaved some other way.

So, (c) has my vote.
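
For reference, the default degree selection described in the quoted text (no
parallel_degree given) could be sketched roughly as below. This is only an
illustration, not code from the patch: the function name and its arguments
are hypothetical, and it assumes "based on number of indexes" means "one
worker per index that supports parallel vacuum", capped by
max_parallel_maintenance_workers.

```python
def default_parallel_degree(index_supports_parallel, max_parallel_maintenance_workers):
    """Hypothetical sketch of the default degree when the user gives none.

    index_supports_parallel: one boolean per index on the table, True if
    that index supports a parallel vacuum (assumed input shape).
    max_parallel_maintenance_workers: the value of the GUC of that name.
    """
    # Count only the indexes that can be vacuumed in parallel.
    eligible = sum(1 for supported in index_supports_parallel if supported)
    # Assumption: one worker per eligible index, capped by the GUC.
    return min(eligible, max_parallel_maintenance_workers)


# Four indexes, three of them eligible, GUC set to 2: the cap applies.
print(default_parallel_degree([True, True, False, True], 2))
```

With no eligible index the sketch returns 0, i.e. no parallel workers.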

> 2. The patch provides a FAST option (based on suggestion by Robert) for a
> parallel vacuum which will make it behave like vacuum_cost_delay = 0 which
> means it will disable throttling. So,
> VACUUM (PARALLEL n, FAST) <tbl_name> will allow the parallel vacuum to run
> without resource throttling. Tomas thinks that we don't need such an
> option as the same can be served by setting vacuum_cost_delay = 0 which is
> a valid argument, but OTOH, providing an option to the user which can make
> his life easier is not a bad idea either.
>
>
The user already has an option (the vacuum_cost_delay GUC). So I kinda
agree with Tomas on this.
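
To illustrate why vacuum_cost_delay = 0 already covers this, here is a
deliberately simplified model of cost-based throttling. It is a sketch under
assumptions, not the server's implementation: the function name and the
per-page cost list are hypothetical, and only the GUC names
vacuum_cost_limit and vacuum_cost_delay come from PostgreSQL.

```python
def total_sleep_ms(page_costs, vacuum_cost_limit, vacuum_cost_delay_ms):
    """Simplified model: total time spent sleeping due to throttling.

    page_costs: cost charged for each page processed (assumed input).
    vacuum_cost_limit: accumulated cost that triggers a sleep.
    vacuum_cost_delay_ms: sleep length in ms; 0 disables throttling.
    """
    if vacuum_cost_delay_ms <= 0:
        return 0  # throttling disabled, never sleep
    balance = 0
    slept = 0
    for cost in page_costs:
        balance += cost
        if balance >= vacuum_cost_limit:
            # Limit reached: sleep once, then start accumulating again.
            slept += vacuum_cost_delay_ms
            balance = 0
    return slept


costs = [10] * 100
print(total_sleep_ms(costs, 200, 2))  # throttled run
print(total_sleep_ms(costs, 200, 0))  # delay 0: no sleeping at all
```

In this model a zero delay gives the same result a FAST option would, which
is the point Tomas made.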

--
Guillaume.
