Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-06-07 06:26:25
Message-ID: CAD21AoD8rVHPYu5iSdAXf3CmrBPakUyquUwzOWtK3K_4t9oZng@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 10, 2019 at 2:19 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Mon, Apr 8, 2019 at 7:25 PM Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >
> > Hello.
> >
> > # Is this still living? I changed the status to "needs review"
> >
> > At Sat, 6 Apr 2019 06:47:32 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoAuD3txrxucnVtM6NGo=JGSjs3VDkoCzN0jGz_egc_82g(at)mail(dot)gmail(dot)com>
> > > > Indeed. How about the following description?
> > > >
> > >
> > > Attached the updated version patches.
> >
> > Thanks.
> >
>
> Thank you for reviewing the patch!
>
> > heapam.h is including access/parallel.h but the file doesn't use
> > parallel.h stuff and storage/shm_toc.h and storage/dsm.h are
> > enough.
>
> Fixed.
>
> >
> > + * DSM keys for parallel lazy vacuum. Since we don't need to worry about DSM
> > + * keys conflicting with plan_node_id we can use small integers.
> >
> > Yeah, this is right, but "plan_node_id" seems abrupt
> > there. Please prepend "differently from parallel execution code"
> > or .. I think no excuse is needed to use that numbers. The
> > executor code is already making an excuse for the large numbers
> > as unusual instead.
>
> Fixed.
>
> >
> > + * Macro to check if we in a parallel lazy vacuum. If true, we're in parallel
> > + * mode and prepared the DSM segments.
> > + */
> > +#define IsInParallelVacuum(lps) (((LVParallelState *) (lps)) != NULL)
> >
> > we *are* in?
>
> Fixed.
>
> >
> > The name "IsInParallleVacuum()" looks (to me) like suggesting
> > "this process is a parallel vacuum worker". How about
> > ParallelVacuumIsActive?
>
> Fixed.
>
> >
> >
> > +typedef struct LVIndStats
> > +typedef struct LVDeadTuples
> > +typedef struct LVShared
> > +typedef struct LVParallelState
> >
> > The names are confusing, and the name LVShared is too
> > generic. Shared-only structs are better to be marked in the name.
> > That is, maybe it would be better that LVIndStats were
> > LVSharedIndStats and LVShared were LVSharedRelStats.
>
> Hmm, LVShared actually stores also various things that are not
> relevant with the relation. I'm not sure that's a good idea to rename
> it to LVSharedRelStats. When we support parallel vacuum for other
> vacuum steps the adding a struct for storing only relation statistics
> might work well.
>
> >
> > It might be better that LVIndStats were moved out from LVShared,
> > but I'm not confident.
> >
> > +static void
> > +lazy_parallel_vacuum_or_cleanup_indexes(LVRelStats *vacrelstats, Relation *Irel
> > ...
> > + lazy_begin_parallel_index_vacuum(lps, vacrelstats, for_cleanup);
> > ...
> > + do_parallel_vacuum_or_cleanup_indexes(Irel, nindexes, stats,
> > + lps->lvshared, vacrelstats->dead_tuples);
> > ...
> > + lazy_end_parallel_index_vacuum(lps, !for_cleanup);
> >
> > The function takes the parameter for_cleanup, but the flag is
> > used by the three subfunctions in utterly ununified way. It seems
> > to me useless to store for_cleanup in lvshared
>
> I think that we need to store for_cleanup or a something telling
> vacuum workers to do either index vacuuming or index cleanup in
> lvshared. Or can we use another thing instead?
>
> > and lazy_end is
> > rather confusing.
>
> Ah, I used "lazy" as prefix of function in vacuumlazy.c. Fixed.
>
> > There's no explanation why "reinitialization"
> > == "!for_cleanup". In the first place,
> > lazy_begin_parallel_index_vacuum and
> > lazy_end_parallel_index_vacuum are called only from the function
> > and rather short so it doesn't seem reasonable that the are
> > independend functions.
>
> Okay agreed, fixed.
>

Since the previous version patch conflicts with current HEAD, I've
attached the updated version patches.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment Content-Type Size
v25-0001-Add-parallel-option-to-VACUUM-command.patch application/octet-stream 52.4 KB
v25-0002-Add-paralell-P-option-to-vacuumdb-command.patch application/octet-stream 5.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2019-06-07 06:30:35 Re: Small review comment on pg_checksums
Previous Message David Rowley 2019-06-07 05:34:20 Re: Should we warn against using too many partitions?