Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-04-10 05:19:59
Message-ID: CAD21AoCX3fmXJrPUWd9H2_iMMtdh32XN7qRx1udWD7aeD0=G_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 8, 2019 at 7:25 PM Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>
> Hello.
>
> # Is this still living? I changed the status to "needs review"
>
> At Sat, 6 Apr 2019 06:47:32 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoAuD3txrxucnVtM6NGo=JGSjs3VDkoCzN0jGz_egc_82g(at)mail(dot)gmail(dot)com>
> > > Indeed. How about the following description?
> > >
> >
> > Attached the updated version patches.
>
> Thanks.
>

Thank you for reviewing the patch!

> heapam.h is including access/parallel.h but the file doesn't use
> parallel.h stuff and storage/shm_toc.h and storage/dsm.h are
> enough.

Fixed.

>
> + * DSM keys for parallel lazy vacuum. Since we don't need to worry about DSM
> + * keys conflicting with plan_node_id we can use small integers.
>
> Yeah, this is right, but "plan_node_id" seems abrupt
> there. Please prepend "differently from parallel execution code"
> or .. I think no excuse is needed to use that numbers. The
> executor code is already making an excuse for the large numbers
> as unusual instead.

Fixed.

>
> + * Macro to check if we in a parallel lazy vacuum. If true, we're in parallel
> + * mode and prepared the DSM segments.
> + */
> +#define IsInParallelVacuum(lps) (((LVParallelState *) (lps)) != NULL)
>
> we *are* in?

Fixed.

>
> The name "IsInParallleVacuum()" looks (to me) like suggesting
> "this process is a parallel vacuum worker". How about
> ParallelVacuumIsActive?

Fixed.

>
>
> +typedef struct LVIndStats
> +typedef struct LVDeadTuples
> +typedef struct LVShared
> +typedef struct LVParallelState
>
> The names are confusing, and the name LVShared is too
> generic. Shared-only structs are better to be marked in the name.
> That is, maybe it would be better that LVIndStats were
> LVSharedIndStats and LVShared were LVSharedRelStats.

Hmm, LVShared actually stores also various things that are not
relevant with the relation. I'm not sure that's a good idea to rename
it to LVSharedRelStats. When we support parallel vacuum for other
vacuum steps the adding a struct for storing only relation statistics
might work well.

>
> It might be better that LVIndStats were moved out from LVShared,
> but I'm not confident.
>
> +static void
> +lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel
> ...
> + lazy_begin_parallel_index_vacuum(lps, vacrelstats, for_cleanup);
> ...
> + do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
> + lps->lvshared, vacrelstats->dead_tuples);
> ...
> + lazy_end_parallel_index_vacuum(lps, !for_cleanup);
>
> The function takes the parameter for_cleanup, but the flag is
> used by the three subfunctions in utterly ununified way. It seems
> to me useless to store for_cleanup in lvshared

I think that we need to store for_cleanup or a something telling
vacuum workers to do either index vacuuming or index cleanup in
lvshared. Or can we use another thing instead?

> and lazy_end is
> rather confusing.

Ah, I used "lazy" as prefix of function in vacuumlazy.c. Fixed.

> There's no explanation why "reinitialization"
> == "!for_cleanup". In the first place,
> lazy_begin_parallel_index_vacuum and
> lazy_end_parallel_index_vacuum are called only from the function
> and rather short so it doesn't seem reasonable that the are
> independend functions.

Okay agreed, fixed.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2019-04-10 05:32:02 Re: Transaction commits VS Transaction commits (with parallel) VS query mean time
Previous Message Tom Lane 2019-04-10 04:57:24 Re: Dependences records and comments