| From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> | 
|---|---|
| To: | Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com> | 
| Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Sergei Kornilov <sk(at)zsrv(dot)org>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <langote_amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: [HACKERS] Block level parallel vacuum | 
| Date: | 2019-12-29 21:23:54 | 
| Message-ID: | 20191229212354.tqivttn23lxjg2jz@development | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On Sun, Dec 29, 2019 at 10:06:23PM +0900, Masahiko Sawada wrote:
>On Fri, 27 Dec 2019 at 11:24, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>
>> Hi,
>>
>> On Wed, Dec 25, 2019 at 09:17:16PM +0900, Masahiko Sawada wrote:
>> >On Tue, 24 Dec 2019 at 15:46, Masahiko Sawada
>> ><masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>> >>
>> >> On Tue, 24 Dec 2019 at 15:44, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> >> >
>> >> > On Tue, Dec 24, 2019 at 12:08 PM Masahiko Sawada
>> >> > <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
>> >> > >
>> >> > >
>> >> > > The first patches look good to me. I'm reviewing other patches and
>> >> > > will post comments if there is.
>> >> > >
>> >>
>> >> Oops I meant first "two" patches look good to me.
>> >>
>> >> >
>> >> > Okay, feel free to address few comments raised by Mahendra along with
>> >> > whatever you find.
>> >>
>> >> Thanks!
>> >>
>> >
>> >I've attached updated patch set as the previous version patch set
>> >conflicts to the current HEAD. This patch set incorporated the review
>> >comments, a few fix and the patch for
>> >PARALLEL_VACUUM_DISABLE_LEADER_PARTICIPATION. 0001 patch is the same
>> >as previous version.
>> >
>>
>> I've been reviewing the updated patches over the past couple of days, so
>> let me share some initial review comments. I initially started to read
>> the thread, but then I realized it's futile - the thread is massive, and
>> the patch changed so much re-reading the whole thread is a waste of time.
>
>Thank you for reviewing this patch!
>
>>
>> It might be useful write a summary of the current design, but AFAICS the
>> original plan to parallelize the heap scan is abandoned and we now do
>> just the steps that vacuum indexes in parallel. Which is fine, but it
>> means the subject "block level parallel vacuum" is a bit misleading.
>>
>
>Yeah I should have renamed it. I'll summarize the current design.
>
OK
>> Anyway, most of the logic is implemented in part 0002, which actually
>> does all the parallel worker stuff. The remaining parts 0001, 0003 and
>> 0004 are either preparing infrastructure or not directlyrelated to the
>> primary feature.
>>
>>
>> v40-0001-Introduce-IndexAM-fields-for-parallel-vacuum.patch
>> -----------------------------------------------------------
>>
>> I wonder if 'amparallelvacuumoptions' is unnecessarily specific. Maybe
>> it should be called just 'amvacuumoptions' or something like that? The
>> 'parallel' part is actually encoded in names of the options.
>>
>
>amvacuumoptions seems good to me.
>
>> Also, why do we need a separate amusemaintenanceworkmem option? Why
>> don't we simply track it using a separate flag in 'amvacuumoptions'
>> (or whatever we end up calling it)?
>>
>
>It also seems like a good idea.
>
I think there's another question we need to ask - why to we introduce a
bitmask, instead of using regular boolean struct members? Until now, the
IndexAmRoutine struct had simple boolean members describing capabilities
of the AM implementation. Why shouldn't this patch do the same thing,
i.e. add one boolean flag for each AM feature?
>> Would it make sense to track m_w_m usage separately for the two index
>> cleanup phases? Or is that unnecessary / pointless?
>
>We could do that but currently index AM uses this option is only gin
>indexes. And gin indexes could use maintenance_work_mem both during
>bulkdelete and cleanup. So it might be unnecessary at least as of now.
>
OK
>>
>>
>> v40-0002-Add-a-parallel-option-to-the-VACUUM-command.patch
>> ----------------------------------------------------------
>>
>> I haven't found any issues yet, but I've only started with the code
>> review. I'll continue with the review. It seems in a fairly good shape
>> though, I think, I only have two minor comments at the moment:
>>
>> - The SizeOfLVDeadTuples macro seems rather weird. It does include space
>>    for one ItemPointerData, but we really need an array of them. But then
>>    all the places where the macro is used explicitly add space for the
>>    pointers, so the sizeof(ItemPointerData) seems unnecessary. So it
>>    should be either
>>
>> #define SizeOfLVDeadTuples (offsetof(LVDeadTuples, itemptrs))
>>
>>    or
>>
>> #define SizeOfLVDeadTuples(cnt) \
>>    (offsetof(LVDeadTuples, itemptrs) + (cnt) * sizeof(ItemPointerData))
>>
>>    in which case the callers can be simplified.
>
>Fixed it to the former.
>
Hmmm, I'd actually suggest to use the latter variant, because it allows
simplifying the callers. Just translating it to offsetof() is not saving
much code, I think.
>>
>> - It's not quite clear to me why we need the new nworkers_to_launch
>>    field in ParallelContext.
>
>The motivation of nworkers_to_launch is to specify the number of
>workers to actually launch when we use the same parallel context
>several times while changing the number of workers to launch. Since
>index AM can choose the participation of bulkdelete and/or cleanup,
>the number of workers required for each vacuum phrases can be
>different. I originally changed LaunchParallelWorkers to have the
>number of workers to launch so that it launches different number of
>workers for each vacuum phases but Robert suggested to change the
>routine of reinitializing parallel context[1]. It would be less
>confusing and would involve modify code in a lot fewer places. So with
>this patch we specify the number of workers during initializing the
>parallel context as a maximum number of workers. And using
>ReinitializeParallelWorkers before doing either bulkdelete or cleanup
>we specify the number of workers to launch.
>
Hmmm. I find it a bit confusing, but I don't know a better solution.
>>
>>
>> v40-0003-Add-FAST-option-to-vacuum-command.patch
>> ------------------------------------------------
>>
>> I do have a bit of an issue with this part - I'm not quite convinved we
>> actually need a FAST option, and I actually suspect we'll come to regret
>> it sooner than later. AFAIK it pretty much does exactly the same thing
>> as setting vacuum_cost_delay to 0, and IMO it's confusing to provide
>> multiple ways to do the same thing - I do expect reports from confused
>> users on pgsql-bugs etc. Why is setting vacuum_cost_delay directly not a
>> sufficient solution?
>
>I think the motivation of this option is similar to FREEZE. I think
>it's sometimes a good idea to have a shortcut of popular usage and
>make it have an name corresponding to its job. From that perspective I
>think having FAST option would make sense but maybe we need more
>discussion the combination parallel vacuum and vacuum delay.
>
OK. I think it's mostly independent piece, so maybe we should move it to
a separate patch. It's more likely to get attention/feedback when not
buried in this thread.
>>
>> The same thing applies to the PARALLEL flag, added in 0002, BTW. Why do
>> we need a separate VACUUM option, instead of just using the existing
>> max_parallel_maintenance_workers GUC? It's good enough for CREATE INDEX
>> so why not here?
>
>AFAIR There was no such discussion so far but I think one reason could
>be that parallel vacuum should be disabled by default. If the parallel
>vacuum uses max_parallel_maintenance_workers (2 by default) rather
>than having the option the parallel vacuum would work with default
>setting but I think that it would become a big impact for user because
>the disk access could become random reads and writes when some indexes
>are on the same tablespace.
>
I'm not quite convinced VACUUM should have parallelism disabled by
default. I know some people argued we should do that because making
vacuum faster may put pressure on other parts of the system. Which is
true, but I don't think the solution is to make vacuum slower by
default. IMHO we should do the opposite - have it parallel by default
(as driven by max_parallel_maintenance_workers), and have an option
to disable parallelism.
It's pretty much the same thing we did with vacuum throttling - it's
disabled for explicit vacuum by default, but you can enable it. If
you're worried about VACUUM causing issues, you should cost delay.
The way it's done now we pretty much don't handle either case without
having to tweak something:
- If you really want to go as fast as possible (e.g. during maintenance
   window) you have to say "PARALLEL".
- If you need to restrict VACUUM activity, you have to et cost_delay
   because just not using parallelism seems unreliable.
Of course, the question is what to do about autovacuum - I agree it may
make sense to have parallelism disabled in this case (just like we
already have throttling enabled by default for autovacuum).
>>
>> Maybe it's explained somewhere deep in the thread, of course ...
>>
>>
>> v40-0004-Add-ability-to-disable-leader-participation-in-p.patch
>> ---------------------------------------------------------------
>>
>> IMHO this should be simply merged into 0002.
>
>We discussed it's still unclear whether we really want to commit this
>code and therefore it's separated from the main part. Please see more
>details here[2].
>
IMO there's not much reason for the leader not to participate. For
regular queries the leader may be doing useful stuff (essentially
running the non-parallel part of the query) but AFAIK for VAUCUM that's
not the case and the worker is not doing anything.
>I've fixed code based on the review comments and rebased to the
>current HEAD. Some comments around vacuum option name and FAST option
>are still left as we would need more discussion.
>
Thanks, I'll take a look.
regards
>--
>Masahiko Sawada            http://www.2ndQuadrant.com/
>PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Vik Fearing | 2019-12-29 22:10:12 | Re: Recognizing superuser in pg_hba.conf | 
| Previous Message | Justin Pryzby | 2019-12-29 20:17:47 | Re: error context for vacuum to include block number (atomic progress update) |