Re: cost based vacuum (parallel)

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: cost based vacuum (parallel)
Date: 2019-11-11 07:29:09
Message-ID: CAFiTN-tC=NcvcEd+5J62fR8-D8x7EHuVi2xhS-0DMf1bnJs4hw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 11, 2019 at 9:43 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Fri, Nov 8, 2019 at 11:49 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Fri, Nov 8, 2019 at 9:39 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > >
> > > I have done some experiments along these lines. I first produced a
> > > case that shows the problem with the existing shared costing
> > > patch (a worker doing less I/O might pay the penalty on behalf
> > > of a worker doing more I/O). I have also hacked Sawada-san's shared
> > > costing patch so that a worker only goes to sleep if the
> > > shared balance has crossed the limit and its local balance has
> > > crossed some threshold[1].
> > >
> > > Test setup: I have created 4 indexes on the table. 3 of the
> > > indexes have a lot of pages to process but need to dirty only a few
> > > pages, whereas the 4th index has very few pages to process
> > > but needs to dirty all of them. I have attached the test script
> > > along with the mail. For each worker I have shown the delay time,
> > > the total I/O[1], and the page hit,
> > > page miss and page dirty counts.
> > > [1] total I/O = _nhit * VacuumCostPageHit + _nmiss *
> > > VacuumCostPageMiss + _ndirty * VacuumCostPageDirty
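
As a sanity check on footnote [1], here is a minimal Python sketch (not part of any patch) that recomputes the total I/O figures from the hit/miss/dirty counts. It assumes the test ran with the default cost weights of that era (vacuum_cost_page_hit=1, vacuum_cost_page_miss=10, vacuum_cost_page_dirty=20); the function name total_io is only for illustration.

```python
# Assumed defaults for vacuum_cost_page_hit / _miss / _dirty at the time
# of this thread; adjust if the test used non-default settings.
VACUUM_COST_PAGE_HIT = 1
VACUUM_COST_PAGE_MISS = 10
VACUUM_COST_PAGE_DIRTY = 20


def total_io(nhit, nmiss, ndirty):
    """Weighted I/O cost as defined in footnote [1]."""
    return (nhit * VACUUM_COST_PAGE_HIT
            + nmiss * VACUUM_COST_PAGE_MISS
            + ndirty * VACUUM_COST_PAGE_DIRTY)


print(total_io(17891, 0, 2))   # counts reported for workers 0-2
print(total_io(4318, 0, 603))  # counts reported for worker 3
```

With those assumed weights the two calls give 17931 and 16378, which matches the "total I/O" column reported for each worker.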
> > >
> > > patch 1: Shared costing patch: (delay condition ->
> > > VacuumSharedCostBalance > VacuumCostLimit)
> > > worker 0 delay=80.00 total I/O=17931 hit=17891 miss=0 dirty=2
> > > worker 1 delay=40.00 total I/O=17931 hit=17891 miss=0 dirty=2
> > > worker 2 delay=110.00 total I/O=17931 hit=17891 miss=0 dirty=2
> > > worker 3 delay=120.98 total I/O=16378 hit=4318 miss=0 dirty=603
> > >
> > > Observation1: I think here it's clearly visible that worker 3 is
> > > doing the least total I/O but delaying for the maximum amount of time.
> > > OTOH, worker 1 is delaying for very little time compared to how much
> > > I/O it is doing. To solve this problem, I have added a small
> > > tweak to the patch, wherein the worker will only sleep if its local
> > > balance has crossed some threshold. We can see that with that
> > > change the problem is solved to a large extent.
> > >
> > > patch 2: Shared costing patch: (delay condition ->
> > > VacuumSharedCostBalance > VacuumCostLimit && VacuumLocalBalance >
> > > VacuumCostLimit/number of workers)
> > > worker 0 delay=100.12 total I/O=17931 hit=17891 miss=0 dirty=2
> > > worker 1 delay=90.00 total I/O=17931 hit=17891 miss=0 dirty=2
> > > worker 2 delay=80.06 total I/O=17931 hit=17891 miss=0 dirty=2
> > > worker 3 delay=80.72 total I/O=16378 hit=4318 miss=0 dirty=603
> > >
> > > Observation2: This patch solves the problem discussed with patch1, but
> > > in some extreme cases there is a possibility that the shared limit can
> > > become twice as much as the local limit and still no worker goes for the
> > > delay. To solve that there could be multiple ideas:
> > > a) Set a max limit on the shared balance, e.g. 1.5 * VacuumCostLimit,
> > > after which we apply the delay to whoever tries to do the I/O,
> > > irrespective of its local balance.
> > > b) Set a slightly lower value for the local threshold, e.g. 50% of the
> > > local limit.
> > >
> > > Here I have changed patch2 as per (b): if the local balance reaches
> > > 50% of the local limit and the shared balance hits the vacuum cost
> > > limit, then go for the delay.
> > >
> > > patch 3: Shared costing patch: (delay condition ->
> > > VacuumSharedCostBalance > VacuumCostLimit && VacuumLocalBalance > 0.5
> > > * VacuumCostLimit/number of workers)
> > > worker 0 delay=70.03 total I/O=17931 hit=17891 miss=0 dirty=2
> > > worker 1 delay=100.14 total I/O=17931 hit=17891 miss=0 dirty=2
> > > worker 2 delay=80.01 total I/O=17931 hit=17891 miss=0 dirty=2
> > > worker 3 delay=101.03 total I/O=16378 hit=4318 miss=0 dirty=603
> > >
> > > Observation3: I think patch3 doesn't completely solve the issue
> > > discussed in patch1, but it's far better than patch1.
> > >
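
To compare the three delay conditions side by side, here is a rough Python sketch. The function names are made up for illustration, the arguments mirror VacuumSharedCostBalance, VacuumLocalBalance and VacuumCostLimit from the conditions quoted above, and this is not the patch code itself.

```python
def should_delay_patch1(shared_balance, cost_limit):
    # patch 1: sleep as soon as the shared balance crosses the limit
    return shared_balance > cost_limit


def should_delay_patch2(shared_balance, local_balance, cost_limit, nworkers):
    # patch 2: additionally require the worker's own balance to exceed
    # its per-worker share of the limit
    return (shared_balance > cost_limit
            and local_balance > cost_limit / nworkers)


def should_delay_patch3(shared_balance, local_balance, cost_limit, nworkers):
    # patch 3: same as patch 2, but with the local threshold halved
    return (shared_balance > cost_limit
            and local_balance > 0.5 * cost_limit / nworkers)


# Hypothetical numbers: limit 200, 4 workers, shared balance just over the
# limit, and a worker that has contributed only 30 units itself.
print(should_delay_patch1(210, 200))         # True
print(should_delay_patch2(210, 30, 200, 4))  # False (30 is not > 50)
print(should_delay_patch3(210, 30, 200, 4))  # True  (30 > 25)
```

The example shows the trade-off: patch 1 penalizes a light worker whenever the shared balance is over the limit, patch 2 lets it through until it has used its full share, and patch 3 sits in between.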
> >
> > Yeah, I think it is difficult to get the exact balance, but we can try
> > to be as close as possible. We can try to play with the threshold and
> > another possibility is to try to sleep in proportion to the amount of
> > I/O done by the worker.
> I have done another experiment where I made 2 more changes on
> top of patch3:
> a) Subtract only the local balance from the total shared balance
> whenever a worker applies the delay.
> b) Compute the delay based on the local balance.
>
> patch4:
> worker 0 delay=84.130000 total I/O=17931 hit=17891 miss=0 dirty=2
> worker 1 delay=89.230000 total I/O=17931 hit=17891 miss=0 dirty=2
> worker 2 delay=88.680000 total I/O=17931 hit=17891 miss=0 dirty=2
> worker 3 delay=80.790000 total I/O=16378 hit=4318 miss=0 dirty=603
>
> I think with this approach the delay is divided among the workers quite
> well compared to the other approaches.
>
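
A rough sketch of how (a) and (b) could fit together, in Python rather than the actual C patch. The proportional formula (cost_delay * local_balance / cost_limit) is an assumption made for illustration, as are the function and parameter names; the attached patches are the authoritative version.

```python
def patch4_step(shared_balance, local_balance, cost_limit,
                cost_delay, nworkers):
    """Return (delay, new_shared_balance, new_local_balance).

    Uses the patch 3 condition to decide whether to sleep at all.
    """
    local_threshold = 0.5 * cost_limit / nworkers
    if shared_balance > cost_limit and local_balance > local_threshold:
        # (b) assumed: delay scales with this worker's own contribution
        delay = cost_delay * local_balance / cost_limit
        # (a) give back only this worker's contribution to the shared
        # balance instead of resetting the whole shared balance
        return delay, shared_balance - local_balance, 0
    return 0.0, shared_balance, local_balance


# Hypothetical numbers: limit 200, base delay 20, 4 workers.
print(patch4_step(220, 110, 200, 20, 4))  # (11.0, 110, 0)
print(patch4_step(220, 10, 200, 20, 4))   # (0.0, 220, 10)
```

Under these assumptions a worker that accounts for half of the limit sleeps for about half of the base delay, while a worker below its local threshold does not sleep and leaves the shared balance untouched.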
> >
> > Thanks for doing these experiments, but I think it is better if you
> > can share the modified patches so that others can also reproduce what
> > you are seeing. There is no need to post the entire parallel vacuum
> > patch-set, but the costing related patch can be posted with a
> > reference to what all patches are required from parallel vacuum
> > thread. Another option is to move this discussion to the parallel
> > vacuum thread, but I think it is better to decide the costing model
> > here.
>
> I have attached the POC patches I have for testing. Steps for testing:
> 1. First, apply the parallel vacuum base patch and the shared costing patch[1].
> 2. Apply 0001-vacuum_costing_test.patch attached in the mail.
> 3. Run the script shared in the previous mail [2]. --> this will give the
> results for patch 1 shared upthread[2]
> 4. Apply patch shared_costing_plus_patch[2] or [3] or [4] to see the
> results with different approaches explained in the mail.
>
>
> [1] https://www.postgresql.org/message-id/CAD21AoAqT17QwKJ_sWOqRxNvg66wMw1oZZzf9Rt-E-zD%2BXOh_Q%40mail.gmail.com
> [2] https://www.postgresql.org/message-id/CAFiTN-tFLN%3Dvdu5Ra-23E9_7Z1JXkk5MkRY3Bkj2zAoWK7fULA%40mail.gmail.com
>
I have tested the same with another workload (test file attached).
I can see the same behaviour with this workload as well: with
patch 4 the distribution of the delay is better compared to the other
patches, i.e. workers with more I/O have more delay and workers with
equal I/O have almost equal delay. The only thing is that the total delay
with patch 4 is slightly less compared to the other patches.

patch1:
worker 0 delay=120.000000 total io=35828 hit=35788 miss=0 dirty=2
worker 1 delay=170.000000 total io=35828 hit=35788 miss=0 dirty=2
worker 2 delay=210.000000 total io=35828 hit=35788 miss=0 dirty=2
worker 3 delay=263.400000 total io=44322 hit=8352 miss=1199 dirty=1199

patch2:
worker 0 delay=190.645000 total io=35828 hit=35788 miss=0 dirty=2
worker 1 delay=160.090000 total io=35828 hit=35788 miss=0 dirty=2
worker 2 delay=170.775000 total io=35828 hit=35788 miss=0 dirty=2
worker 3 delay=243.180000 total io=44322 hit=8352 miss=1199 dirty=1199

patch3:
worker 0 delay=191.765000 total io=35828 hit=35788 miss=0 dirty=2
worker 1 delay=180.935000 total io=35828 hit=35788 miss=0 dirty=2
worker 2 delay=201.305000 total io=35828 hit=35788 miss=0 dirty=2
worker 3 delay=192.770000 total io=44322 hit=8352 miss=1199 dirty=1199

patch4:
worker 0 delay=175.290000 total io=35828 hit=35788 miss=0 dirty=2
worker 1 delay=174.135000 total io=35828 hit=35788 miss=0 dirty=2
worker 2 delay=175.560000 total io=35828 hit=35788 miss=0 dirty=2
worker 3 delay=212.100000 total io=44322 hit=8352 miss=1199 dirty=1199
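
To put numbers on "better distribution", here is a small Python sketch over the results above. It computes, per patch, the total delay and the ratio between the largest and smallest delay-per-unit-of-I/O across workers (1.0 would mean delay is exactly proportional to I/O). The metric and the names results and summarize are only for illustration.

```python
# (delay, total io) per worker, copied from the results above
results = {
    "patch1": [(120.000, 35828), (170.000, 35828),
               (210.000, 35828), (263.400, 44322)],
    "patch2": [(190.645, 35828), (160.090, 35828),
               (170.775, 35828), (243.180, 44322)],
    "patch3": [(191.765, 35828), (180.935, 35828),
               (201.305, 35828), (192.770, 44322)],
    "patch4": [(175.290, 35828), (174.135, 35828),
               (175.560, 35828), (212.100, 44322)],
}


def summarize(rows):
    """Return (total delay, max/min ratio of delay per unit of I/O)."""
    total_delay = sum(delay for delay, _ in rows)
    per_io = [delay / io for delay, io in rows]
    return total_delay, max(per_io) / min(per_io)


for name, rows in results.items():
    total, spread = summarize(rows)
    print(f"{name}: total delay={total:.3f} "
          f"max/min delay per I/O={spread:.2f}")
```

By this measure patch 4 has the smallest spread (about 1.02) and also the smallest total delay (737.085), which is consistent with both remarks above.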

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
test1.sh text/x-sh 379 bytes
test1.sql application/octet-stream 460 bytes
