Re: [HACKERS] Block level parallel vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Steele <david(at)pgmasters(dot)net>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Block level parallel vacuum
Date: 2019-03-19 08:51:32
Message-ID: CAD21AoCUZQmyXrwDw57ejoR-j1QrGqm_vrQKOkif_aJK4Gih6Q@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 19, 2019 at 10:39 AM Haribabu Kommi
<kommi(dot)haribabu(at)gmail(dot)com> wrote:
>
>
> On Mon, Mar 18, 2019 at 1:58 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>>
>> On Tue, Feb 26, 2019 at 7:20 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> >
>> > On Tue, Feb 26, 2019 at 1:35 PM Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com> wrote:
>> > >
>> > > On Thu, Feb 14, 2019 at 9:17 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>> > >>
>> > >> Thank you. Attached the rebased patch.
>> > >
>> > >
>> > > I ran some performance tests to compare the parallelism benefits,
>> >
>> > Thank you for testing!
>> >
>> > > but I got some strange results showing performance overhead, maybe
>> > > because I tested it on my laptop.
>> >
>> > Hmm, I think parallel vacuum would help for heavy workloads like a
>> > big table with multiple indexes. In your test results, all executions
>> > completed within 1 sec, which seems to be a case where parallel
>> > vacuum wouldn't help. I suspect that the table is small,
>> > right? Anyway, I'll also do performance tests.
>> >
>>
>> Here are the performance test results. I set up a 500MB table with
>> several indexes and made 10% of the table dirty before each vacuum.
>> I compared the execution time of the patched postgres with the current
>> HEAD (the 'speed_up' column is head / patched). In my environment:
>>
>>  indexes | parallel_degree |  patched   |    head    | speed_up
>> ---------+-----------------+------------+------------+----------
>>        0 |               0 |   238.2085 |   244.7625 |   1.0275
>>        0 |               1 |   237.7050 |   244.7625 |   1.0297
>>        0 |               2 |   238.0390 |   244.7625 |   1.0282
>>        0 |               4 |   238.1045 |   244.7625 |   1.0280
>>        0 |               8 |   237.8995 |   244.7625 |   1.0288
>>        0 |              16 |   237.7775 |   244.7625 |   1.0294
>>        1 |               0 |  1328.8590 |  1334.9125 |   1.0046
>>        1 |               1 |  1325.9140 |  1334.9125 |   1.0068
>>        1 |               2 |  1333.3665 |  1334.9125 |   1.0012
>>        1 |               4 |  1329.5205 |  1334.9125 |   1.0041
>>        1 |               8 |  1334.2255 |  1334.9125 |   1.0005
>>        1 |              16 |  1335.1510 |  1334.9125 |   0.9998
>>        2 |               0 |  2426.2905 |  2427.5165 |   1.0005
>>        2 |               1 |  1416.0595 |  2427.5165 |   1.7143
>>        2 |               2 |  1411.6270 |  2427.5165 |   1.7197
>>        2 |               4 |  1411.6490 |  2427.5165 |   1.7196
>>        2 |               8 |  1410.1750 |  2427.5165 |   1.7214
>>        2 |              16 |  1413.4985 |  2427.5165 |   1.7174
>>        4 |               0 |  4622.5060 |  4619.0340 |   0.9992
>>        4 |               1 |  2536.8435 |  4619.0340 |   1.8208
>>        4 |               2 |  2548.3615 |  4619.0340 |   1.8126
>>        4 |               4 |  1467.9655 |  4619.0340 |   3.1466
>>        4 |               8 |  1486.3155 |  4619.0340 |   3.1077
>>        4 |              16 |  1481.7150 |  4619.0340 |   3.1174
>>        8 |               0 |  9039.3810 |  8990.4735 |   0.9946
>>        8 |               1 |  4807.5880 |  8990.4735 |   1.8701
>>        8 |               2 |  3786.7620 |  8990.4735 |   2.3742
>>        8 |               4 |  2924.2205 |  8990.4735 |   3.0745
>>        8 |               8 |  2684.2545 |  8990.4735 |   3.3493
>>        8 |              16 |  2672.9800 |  8990.4735 |   3.3635
>>       16 |               0 | 17821.4715 | 17740.1300 |   0.9954
>>       16 |               1 |  9318.3810 | 17740.1300 |   1.9038
>>       16 |               2 |  7260.6315 | 17740.1300 |   2.4433
>>       16 |               4 |  5538.5225 | 17740.1300 |   3.2030
>>       16 |               8 |  5368.5255 | 17740.1300 |   3.3045
>>       16 |              16 |  5291.8510 | 17740.1300 |   3.3523
>> (36 rows)
>
>
> The performance results are good. Do we want to add a recommended
> table size to the documentation for the parallel option? Using the
> parallel option on smaller tables can lead to performance overhead.
>

Hmm, I don't think we can document a specific recommended size, because
the performance gain from parallel lazy vacuum depends on various factors
such as the number of CPU cores, the number of indexes, the shared buffer
size, the index types, and whether the storage is HDD or SSD. I suppose
that users who want to use this option already have some sort of
performance problem, such as vacuum taking a very long time, so they
would use it on relatively large tables.
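
To make the dependence on the number of indexes concrete, here is a
minimal Python sketch that recomputes speed_up as head / patched for a
few rows copied from the table above. The function name and the row
selection are only for illustration and are not part of the patch or
of the test script.

```python
# (indexes, parallel_degree, patched, head) rows copied from the table above.
ROWS = [
    (0, 16, 237.7775, 244.7625),
    (2, 2, 1411.6270, 2427.5165),
    (2, 16, 1413.4985, 2427.5165),
    (16, 16, 5291.8510, 17740.1300),
]


def speed_up(patched, head):
    """Ratio of HEAD time to patched time; above 1 means the patch is faster."""
    return head / patched


def summarize(rows):
    """Return (indexes, parallel_degree, speed_up rounded to 4 places) per row."""
    return [(i, d, round(speed_up(p, h), 4)) for i, d, p, h in rows]


if __name__ == "__main__":
    for indexes, degree, ratio in summarize(ROWS):
        print(f"indexes={indexes:2d} degree={degree:2d} speed_up={ratio}")
```

In these rows, a table with no indexes gains almost nothing, and with 2
indexes a degree of 16 does no better than a degree of 2, so any single
recommended size would be misleading.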

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
