Re: cost based vacuum (parallel)

From: Mahendra Singh <mahi6run(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: cost based vacuum (parallel)
Date: 2019-11-14 11:32:26
Message-ID: CAKYtNAp-pRCr5C06iXDCetCDqceuoEAGfkfikjug03pznZ2apA@mail.gmail.com
Lists: pgsql-hackers

On Mon, 11 Nov 2019 at 17:56, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Mon, Nov 11, 2019 at 5:14 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Mon, Nov 11, 2019 at 4:23 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > ..
> > > > I have tested the same with some other workload (test file attached).
> > > > I can see the same behaviour with this workload as well: with patch 4,
> > > > the distribution of the delay is better compared to the other patches,
> > > > i.e. workers with more I/O have more delay, and those with equal I/O
> > > > have almost equal delay. The only thing is that the total delay with
> > > > patch 4 is slightly less compared to the other patches.
> > > >
> > >
> > > I see one problem with the formula you have used in the patch; maybe
> > > that is what is causing the total delay to go down.
> > >
> > > - if (new_balance >= VacuumCostLimit)
> > > + VacuumCostBalanceLocal += VacuumCostBalance;
> > > + if ((new_balance >= VacuumCostLimit) &&
> > > + (VacuumCostBalanceLocal > VacuumCostLimit/(0.5 * nworker)))
> > >
> > > As per discussion, the second part of the condition should be
> > > "VacuumCostBalanceLocal > (0.5) * VacuumCostLimit/nworker".
> > My bad.
> > > I think you can change this once and try again. Also, please try with
> > > different values of the threshold (0.3, 0.5, 0.7, etc.).
> > >
> > Okay, I will retest with both patch 3 and patch 4 for both scenarios.
> > I will also try different multipliers.
> >
>
> One more thing: I think we should also test these cases with a varying
> number of indexes (say 2, 6, 8, etc.), and then we should probably test
> with a varying number of workers where the number of workers is less than
> the number of indexes. You can do these after finishing your previous
> experiments.
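
For readers following the diff quoted above, here is a minimal C sketch of
the corrected condition. This is an illustration rather than the actual
patch: the variable names follow the quoted diff, but the function wrapper,
the reset of the local balance after sleeping, and the fixed 0.5 threshold
are assumptions made for the sketch.

    /*
     * Sketch: decide whether this parallel vacuum worker should sleep.
     * VacuumCostBalance, VacuumCostBalanceLocal, and VacuumCostLimit are
     * the globals from the patch; new_balance is the updated shared cost
     * balance and nworker the number of workers sharing the limit.
     */
    static bool
    worker_should_sleep(int new_balance, int nworker)
    {
        /* accumulate this worker's own contribution locally */
        VacuumCostBalanceLocal += VacuumCostBalance;

        /*
         * Sleep only when the shared balance has crossed the limit AND
         * this worker has itself accumulated more than half of its
         * per-worker share of the limit, so that workers doing little
         * I/O do not pay the delay on behalf of the busy ones.
         */
        if (new_balance >= VacuumCostLimit &&
            VacuumCostBalanceLocal > 0.5 * VacuumCostLimit / nworker)
        {
            VacuumCostBalanceLocal = 0; /* assumed: reset after sleeping */
            return true;
        }
        return false;
    }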

On top of the parallel vacuum patch, I applied Dilip's patch
(0001-vacuum_costing_test.patch). I tested with a varying number of
indexes and a varying number of workers, comparing the shared costing
base patch (0001-vacuum_costing_test.patch) against the latest shared
costing patch (shared_costing_plus_patch4_v1.patch).
With the shared costing base patch, I can see that the delay is not in
sync with the I/O done, which is resolved by applying
shared_costing_plus_patch4_v1.patch. I have also observed that the total
delay is slightly reduced with shared_costing_plus_patch4_v1.patch.

Below is the full testing summary.

*Test setup:*
step 1) Apply the parallel vacuum patch.
step 2) Apply 0001-vacuum_costing_test.patch (on top of this patch, the
delay is not in sync with the I/O done).
step 3) Apply shared_costing_plus_patch4_v1.patch (the delay is in sync
with the I/O done).

*Configuration settings:*
autovacuum = off
max_parallel_workers = 30
shared_buffers = 2GB
max_parallel_maintenance_workers = 20
vacuum_cost_limit = 2000
vacuum_cost_delay = 10
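
As a reference for reading the numbers below: vacuum sleeps
vacuum_cost_delay (10 ms) each time its accumulated cost reaches
vacuum_cost_limit (2000), and the "total io" reported by the test patch
is that accumulated cost (with the default page costs it works out as
io = hit*1 + miss*10 + dirty*20, which matches the figures below). So the
expected total delay for a worker is roughly:

    expected delay = (total io / vacuum_cost_limit) * vacuum_cost_delay
                   = (total io / 2000) * 10 ms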

*Test 1: Vary indexes (2, 4, 6, 8) with parallel workers fixed at 4:*

Case 1) With 2 indexes:
*Without shared_costing_plus_patch4_v1.patch:*
WARNING: worker 0 delay=120.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 1 delay=60.000000 total io=17931 hit=17891 miss=0 dirty=2

*With shared_costing_plus_patch4_v1.patch:*
WARNING: worker 0 delay=87.780000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 1 delay=87.995000 total io=17931 hit=17891 miss=0 dirty=2
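
(Each worker here accrued 17931 cost units, so the expected per-worker
delay is 17931/2000 * 10 ms ~= 89.7 ms. The patched workers are close to
that, while the unpatched ones split roughly the same total unevenly:
120 + 60 = 180 ms ~= 2 * 89.7 ms.)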

Case 2) With 4 indexes:
*Without shared_costing_plus_patch4_v1.patch:*
WARNING: worker 0 delay=120.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 1 delay=80.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 2 delay=60.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 3 delay=100.000000 total io=17931 hit=17891 miss=0 dirty=2

*With shared_costing_plus_patch4_v1.patch:*
WARNING: worker 0 delay=87.430000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 1 delay=87.175000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 2 delay=86.340000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 3 delay=88.020000 total io=17931 hit=17891 miss=0 dirty=2

Case 3) With 6 indexes:
*Without shared_costing_plus_patch4_v1.patch:*
WARNING: worker 0 delay=110.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 1 delay=100.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 2 delay=160.000000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 3 delay=90.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 4 delay=80.000000 total io=17931 hit=17891 miss=0 dirty=2

*With shared_costing_plus_patch4_v1.patch*:
WARNING: worker 0 delay=173.195000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 1 delay=88.715000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 2 delay=87.710000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 3 delay=86.460000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 4 delay=89.435000 total io=17931 hit=17891 miss=0 dirty=2

Case 4) With 8 indexes:
*Without shared_costing_plus_patch4_v1.patch:*
WARNING: worker 0 delay=170.000000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 1 delay=120.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 2 delay=130.000000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 3 delay=190.000000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 4 delay=110.000000 total io=35862 hit=35782 miss=0 dirty=4

*With shared_costing_plus_patch4_v1.patch*:
WARNING: worker 0 delay=174.700000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 1 delay=177.880000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 2 delay=89.460000 total io=17931 hit=17891 miss=0 dirty=2
WARNING: worker 3 delay=177.320000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 4 delay=86.810000 total io=17931 hit=17891 miss=0 dirty=2
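
(In Cases 3 and 4 there are more indexes than the five processes shown,
so some of them vacuum two indexes each, hence total io = 35862 =
2 * 17931. With the patch, their delay is correspondingly about double
the single-index delay, again tracking io/2000 * 10 ms.)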

*Test 2: Indexes fixed at 16 with parallel workers varied (2, 4, 8):*

Case 1) With 2 parallel workers:
*Without shared_costing_plus_patch4_v1.patch:*
WARNING: worker 0 delay=1513.230000 total io=307197 hit=85167 miss=22179 dirty=12
WARNING: worker 1 delay=1543.385000 total io=326553 hit=63133 miss=26322 dirty=10
WARNING: worker 2 delay=1633.625000 total io=302199 hit=65839 miss=23616 dirty=10

*With shared_costing_plus_patch4_v1.patch:*
WARNING: worker 0 delay=1539.475000 total io=308175 hit=65175 miss=24280 dirty=10
WARNING: worker 1 delay=1251.200000 total io=250692 hit=71562 miss=17893 dirty=10
WARNING: worker 2 delay=1143.690000 total io=228987 hit=93857 miss=13489 dirty=12

Case 2) With 4 parallel workers:
*Without shared_costing_plus_patch4_v1.patch:*
WARNING: worker 0 delay=1182.430000 total io=213567 hit=16037 miss=19745 dirty=4
WARNING: worker 1 delay=1202.710000 total io=178941 hit=1 miss=17890 dirty=2
WARNING: worker 2 delay=210.000000 total io=89655 hit=89455 miss=0 dirty=10
WARNING: worker 3 delay=270.000000 total io=71724 hit=71564 miss=0 dirty=8
WARNING: worker 4 delay=851.825000 total io=188229 hit=58619 miss=12945 dirty=8

*With shared_costing_plus_patch4_v1.patch:*
WARNING: worker 0 delay=1136.875000 total io=227679 hit=14469 miss=21313 dirty=4
WARNING: worker 1 delay=973.745000 total io=196881 hit=17891 miss=17891 dirty=4
WARNING: worker 2 delay=447.410000 total io=89655 hit=89455 miss=0 dirty=10
WARNING: worker 3 delay=833.235000 total io=168228 hit=40958 miss=12715 dirty=6
WARNING: worker 4 delay=683.200000 total io=136488 hit=64368 miss=7196 dirty=8

Case 3) With 8 parallel workers:
*Without shared_costing_plus_patch4_v1.patch:*
WARNING: worker 0 delay=1022.300000 total io=178941 hit=1 miss=17890 dirty=2
WARNING: worker 1 delay=1072.770000 total io=178941 hit=1 miss=17890 dirty=2
WARNING: worker 2 delay=170.000000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 3 delay=170.000000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 4 delay=140.035000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 5 delay=200.000000 total io=53802 hit=53672 miss=1 dirty=6
WARNING: worker 6 delay=130.000000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 7 delay=150.000000 total io=53793 hit=53673 miss=0 dirty=6

*With shared_costing_plus_patch4_v1.patch:*
WARNING: worker 0 delay=872.800000 total io=178941 hit=1 miss=17890 dirty=2
WARNING: worker 1 delay=885.950000 total io=178941 hit=1 miss=17890 dirty=2
WARNING: worker 2 delay=175.680000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 3 delay=259.560000 total io=53793 hit=53673 miss=0 dirty=6
WARNING: worker 4 delay=169.945000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 5 delay=613.845000 total io=125100 hit=45750 miss=7923 dirty=6
WARNING: worker 6 delay=171.895000 total io=35862 hit=35782 miss=0 dirty=4
WARNING: worker 7 delay=176.505000 total io=35862 hit=35782 miss=0 dirty=4
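
(The same proportionality check applies here: without the patch, worker 7
with io=53793 sleeps less (150 ms) than worker 2 with io=35862 (170 ms),
while with the patch each worker's delay again tracks io/2000 * 10 ms,
e.g. io=178941 predicts ~894.7 ms against the observed 872.80/885.95 ms.)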

Thanks and Regards
Mahendra Thalor
EnterpriseDB: http://www.enterprisedb.com
