Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_statements issue with parallel maintenance (Was Re: WAL usage calculation patch)
Date: 2020-03-31 14:02:35
Message-ID: CAFiTN-t_mOzS9XLdu7R_3vFb7B+Kw8rzNDQ=cub7f6TTtdPffQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 31, 2020 at 12:20 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Tue, Mar 31, 2020 at 10:44 AM Masahiko Sawada
> <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> >
> > On Tue, 31 Mar 2020 at 12:58, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Mon, Mar 30, 2020 at 12:31 PM Masahiko Sawada
> > > <masahiko(dot)sawada(at)2ndquadrant(dot)com> wrote:
> > > >
> > > > The patch for vacuum conflicts with recent changes in vacuum. So I've
> > > > attached rebased one.
> > > >
> > >
> > > + /*
> > > + * Next, accumulate buffer usage. (This must wait for the workers to
> > > + * finish, or we might get incomplete data.)
> > > + */
> > > + for (i = 0; i < nworkers; i++)
> > > + InstrAccumParallelQuery(&lps->buffer_usage[i]);
> > > +
> > >
> > > This should be done for launched workers aka
> > > lps->pcxt->nworkers_launched. I think a similar problem exists in
> > > create index related patch.
> >
> > You're right. Fixed in the new patches.
> >
> > On Mon, 30 Mar 2020 at 17:00, Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
> > >
> > > Just minor nitpicking:
> > >
> > > + int i;
> > >
> > > Assert(!IsParallelWorker());
> > > Assert(ParallelVacuumIsActive(lps));
> > > @@ -2166,6 +2172,13 @@ lazy_parallel_vacuum_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
> > > /* Wait for all vacuum workers to finish */
> > > WaitForParallelWorkersToFinish(lps->pcxt);
> > >
> > > + /*
> > > + * Next, accumulate buffer usage. (This must wait for the workers to
> > > + * finish, or we might get incomplete data.)
> > > + */
> > > + for (i = 0; i < nworkers; i++)
> > > + InstrAccumParallelQuery(&lps->buffer_usage[i]);
> > >
> > > We now allow declaring a variable in those loops, so it may be better to avoid
> > > declaring i outside the for scope?
> >
> > We can do that but I was not sure if it's good since other codes
> > around there don't use that. So I'd like to leave it for committers.
> > It's a trivial change.
>
> I have reviewed the patch and the patch looks fine to me.
>
> One minor comment
> /+ /* Points to buffer usage are in DSM */
> + BufferUsage *buffer_usage;
> +
> /buffer usage are in DSM / buffer usage area in DSM
>

While testing I have found one issue. Basically, during a parallel
vacuum, it was showing more number of
shared_blk_hits+shared_blks_read. After, some investigation, I found
that during the cleanup phase nworkers are -1, and because of this we
didn't try to launch worker but "lps->pcxt->nworkers_launched" had the
old launched worker count and shared memory also had old buffer read
data which was never updated as we did not try to launch the worker.

diff --git a/src/backend/access/heap/vacuumlazy.c
b/src/backend/access/heap/vacuumlazy.c
index b97b678..5dfaf4d 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -2150,7 +2150,8 @@ lazy_parallel_vacuum_indexes(Relation *Irel,
IndexBulkDeleteResult **stats,
* Next, accumulate buffer usage. (This must wait for the workers to
* finish, or we might get incomplete data.)
*/
- for (i = 0; i < lps->pcxt->nworkers_launched; i++)
+ nworkers = Min(nworkers, lps->pcxt->nworkers_launched);
+ for (i = 0; i < nworkers; i++)
InstrAccumParallelQuery(&lps->buffer_usage[i]);

It worked after the above fix.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2020-03-31 14:04:27 Re: A bug when use get_bit() function for a long bytea string
Previous Message Julien Rouhaud 2020-03-31 14:01:16 Re: WAL usage calculation patch