Re: Block level parallel vacuum WIP

From: Васильев Дмитрий <d(dot)vasilyev(at)postgrespro(dot)ru>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Block level parallel vacuum WIP
Date: 2016-08-23 12:25:14
Message-ID: CAB-SwXaDPUY1t3SHEQJUDsgXHehXBiGjCmQj=rnSUSr8K_ob1A@mail.gmail.com
Lists: pgsql-hackers

I repeated your test on a ProLiant DL580 Gen9 with Xeon E7-8890 v3.

pgbench -s 100, then VACUUM on pgbench_accounts after 10_000 transactions:

with: alter system set vacuum_cost_delay to DEFAULT;
parallel_vacuum_workers | time
1 | 138.703,263 ms
2 | 83.751,064 ms
4 | 66.105,861 ms
8 | 59.820,171 ms

with: alter system set vacuum_cost_delay to 1;
parallel_vacuum_workers | time
1 | 127.210,896 ms
2 | 75.300,278 ms
4 | 64.253,087 ms
8 | 60.130,953 ms
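
To make the scaling easier to read, here is a small sketch that turns the timings above into speedups relative to 1 worker. It assumes the figures use "." as a thousands separator and "," as the decimal mark (so "138.703,263 ms" is about 138.7 seconds); the helper names parse_ms and speedups are only illustrative.

```python
def parse_ms(text):
    """Parse a timing such as "138.703,263" into milliseconds.

    Assumption: "." groups thousands and "," is the decimal mark.
    """
    return float(text.replace(".", "").replace(",", "."))


# Timings copied from the two tables above, keyed by parallel_vacuum_workers.
default_delay = {1: "138.703,263", 2: "83.751,064", 4: "66.105,861", 8: "59.820,171"}
delay_1 = {1: "127.210,896", 2: "75.300,278", 4: "64.253,087", 8: "60.130,953"}


def speedups(timings):
    """Return {workers: time(1 worker) / time(workers)}."""
    base = parse_ms(timings[1])
    return {workers: base / parse_ms(t) for workers, t in timings.items()}


for label, timings in (("vacuum_cost_delay=DEFAULT", default_delay),
                       ("vacuum_cost_delay=1", delay_1)):
    for workers, ratio in speedups(timings).items():
        print(f"{label} workers={workers} speedup={ratio:.2f}x")
```

Under that reading of the number format, 8 workers come out at a little over 2x the single-worker time in both settings, so the gain per added worker shrinks quickly after 2 workers.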

---
Dmitry Vasilyev
Postgres Professional: http://www.postgrespro.ru
The Russian Postgres Company

2016-08-23 14:02 GMT+03:00 Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>:

> Hi all,
>
> I'd like to propose block level parallel VACUUM.
> This feature makes it possible for VACUUM to use multiple CPU cores.
>
> Vacuum Processing Logic
> ===================
>
> PostgreSQL VACUUM processing logic consists of 2 phases:
> 1. Collecting dead tuple locations on heap.
> 2. Reclaiming dead tuples from heap and indexes.
> Phases 1 and 2 are executed alternately: once the amount of dead
> tuple locations collected in phase 1 reaches maintenance_work_mem,
> phase 2 is executed.
>
> Basic Design
> ==========
>
> As a PoC, I implemented parallel vacuum so that each worker
> processes both phases 1 and 2 for a particular block range.
> Suppose we vacuum a 1000-block table with 4 workers: each worker
> processes 250 consecutive blocks in phase 1 and then reclaims dead
> tuples from heap and indexes (phase 2).
> To use the visibility map efficiently, each worker scans a particular
> block range of the relation and collects dead tuple locations.
> After each worker finishes its task, the leader process gathers the
> vacuum statistics and updates relfrozenxid if possible.
>
> I also changed the buffer lock infrastructure so that multiple
> processes can wait for cleanup lock on a buffer.
> And the new GUC parameter vacuum_parallel_workers controls the number
> of vacuum workers.
>
> Performance(PoC)
> =========
>
> I ran parallel vacuum on a 13GB table (pgbench scale 1000) with several
> workers (on my poor virtual machine).
> The result is,
>
> 1. Vacuum whole table without index (disable page skipping)
> 1 worker : 33 sec
> 2 workers : 27 sec
> 3 workers : 23 sec
> 4 workers : 22 sec
>
> 2. Vacuum table and index (after 10000 transactions executed)
> 1 worker : 12 sec
> 2 workers : 49 sec
> 3 workers : 54 sec
> 4 workers : 53 sec
>
> In my test, since multiple processes could frequently try to
> acquire the cleanup lock on the same index buffer, the execution time
> of parallel vacuum got worse.
> So far it seems to be effective only for table vacuum, and even there
> the improvement is less than expected (maybe a disk bottleneck).
>
> Another Design
> ============
> ISTM that processing index vacuum by multiple processes is not a good
> idea in most cases, because many index items can be stored in a page
> and multiple vacuum workers could try to acquire the cleanup lock on
> the same index buffer.
> It would be better for multiple workers to each process a particular
> heap block range, and then for one worker per index to process the
> index vacuum.
>
> There is still lots of work to do, but I have attached a PoC patch.
> Feedback and suggestions are very welcome.
>
> Regards,
>
> --
> Masahiko Sawada
>
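
For readers following the quoted "Basic Design" section, the block-range split it describes (1000 blocks, 4 workers, 250 consecutive blocks each) can be sketched as below. This is only an illustration of the idea, not code from the PoC patch; the function name split_block_ranges and the way leftover blocks are handed to the first workers are assumptions.

```python
def split_block_ranges(nblocks, nworkers):
    """Return one half-open (start, end) block range per worker.

    Assumption: when nblocks is not divisible by nworkers, the first
    few workers each take one extra block. The PoC patch may handle
    the remainder differently.
    """
    if nworkers < 1:
        raise ValueError("nworkers must be at least 1")
    base, extra = divmod(nblocks, nworkers)
    ranges = []
    start = 0
    for worker in range(nworkers):
        length = base + (1 if worker < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


# The example from the proposal: a 1000-block table and 4 workers.
print(split_block_ranges(1000, 4))
# prints [(0, 250), (250, 500), (500, 750), (750, 1000)]
```

Each worker would then run phase 1 and phase 2 over its own range, which is why workers can still collide on shared index pages even though their heap ranges are disjoint.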
