Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Mahendra Singh <mahi6run(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, masahiko(dot)sawada(at)2ndquadrant(dot)com
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-11-01 08:51:06
Message-ID: CAD21AoDbqPr=Z3U4NBEEAbMEoFo-LTDYDFcxzdYTg7X7MQZ=RA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Oct 31, 2019 at 3:45 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Thu, Oct 31, 2019 at 11:33 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Tue, Oct 29, 2019 at 1:59 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > Actually after increased shared_buffer I got expected results:
> > >
> > > * Test1 (after increased shared_buffers)
> > > normal : 2807 ms (hit 56295, miss 2, dirty 3, total 56300)
> > > 2 workers : 2840 ms (hit 56295, miss 2, dirty 3, total 56300)
> > > 1 worker : 2841 ms (hit 56295, miss 2, dirty 3, total 56300)
> > >
> > > I updated the patch that computes the total cost delay shared by
> > > Dilip[1] so that it collects the number of buffer hits and so on, and
> > > have attached it. It can be applied on top of my latest patch set[1].
>
> While reading your modified patch (PoC-delay-stats.patch), I have
> noticed that in my patch I used below formulae to compute the total
> delay
> total delay = delay in heap scan + (total delay of index scan
> /nworkers). But, in your patch, I can see that it is just total sum of
> all delay. IMHO, the total sleep time during the index vacuum phase
> must be divided by the number of workers, because even if at some
> point, all the workers go for sleep (e.g. 10 msec) then the delay in
> I/O will be only for 10msec not 30 msec. I think the same is
> discussed upthread[1]
>

I think that the two approaches make parallel vacuum workers wait in
different ways: in approach (a) the vacuum delay works as if the vacuum
were performed by a single process, whereas in approach (b) the
vacuum delay works for each worker independently.

Suppose that the total number of blocks to vacuum is 10,000, the cost
per block is 10, the cost limit is 200, and the sleep time is 5 ms. In
a single-process vacuum the total sleep time is 2,500 ms (= (10,000 *
10 / 200) * 5). Approach (a) gives the same 2,500 ms, because all
parallel vacuum workers use the shared balance value and a worker
sleeps once the balance value exceeds the limit. In approach (b),
since the cost limit is divided evenly, the limit for each worker is
40 (e.g. with a parallel degree of 5). Supposing each worker processes
blocks evenly, the total sleep time of all workers is 12,500 ms (=
(2,000 * 10 / 40) * 5 * 5). I think that's why we can compute the
sleep time of approach (b) by dividing the total value by the number
of parallel workers.
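
To make the arithmetic above easy to check, here is a minimal sketch
in Python. It is not taken from the patch; the function and variable
names are made up for illustration, and it assumes a process sleeps
exactly once each time its accumulated cost reaches its limit and
that blocks are spread evenly over the workers.

```python
def total_sleep_ms(nblocks, cost_per_block, cost_limit, sleep_ms):
    """Total sleep for one balance counter: one sleep per cost_limit of cost."""
    num_sleeps = (nblocks * cost_per_block) // cost_limit
    return num_sleeps * sleep_ms


# Numbers from the example above.
NBLOCKS = 10_000
COST_PER_BLOCK = 10
COST_LIMIT = 200
SLEEP_MS = 5
NWORKERS = 5  # parallel degree (assumed, as in the example)

# Single-process vacuum.
single = total_sleep_ms(NBLOCKS, COST_PER_BLOCK, COST_LIMIT, SLEEP_MS)

# Approach (a): one shared balance and one shared limit, so the sum of
# all sleeps is the same as in the single-process case.
approach_a_sum = total_sleep_ms(NBLOCKS, COST_PER_BLOCK, COST_LIMIT, SLEEP_MS)

# Approach (b): each worker gets 1/NWORKERS of the limit and of the blocks.
per_worker_b = total_sleep_ms(
    NBLOCKS // NWORKERS, COST_PER_BLOCK, COST_LIMIT // NWORKERS, SLEEP_MS
)
approach_b_sum = per_worker_b * NWORKERS

# Dividing the summed sleep of approach (b) by the number of workers.
approach_b_divided = approach_b_sum // NWORKERS

print(single, approach_a_sum, approach_b_sum, approach_b_divided)
```

With these inputs the summed sleep in approach (b) is five times the
single-process value, and dividing it by the number of workers brings
it back to the same figure.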

IOW, approach (b) makes parallel vacuum delay much more than normal
vacuum and than parallel vacuum with approach (a), even with the same
settings. Which behavior do we expect? I thought the vacuum delay for
parallel vacuum should work as if it were a single-process vacuum, as
we did for memory usage. I might be missing something. If we prefer
approach (b), I should change the patch so that the leader process
divides the cost limit evenly.

Regards,

--
Masahiko Sawada
