Re: Optimization for lazy_scan_heap

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Optimization for lazy_scan_heap
Date: 2016-09-08 08:03:42
Message-ID: CAD21AoCMz5GuSdFBgbmhq2OqFGgr=wGrKop+n1SsrwQCmahQTg@mail.gmail.com
Lists: pgsql-hackers

On Wed, Sep 7, 2016 at 4:11 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> On 7 September 2016 at 04:13, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
>> Since current HEAD could scan visibility map twice, the execution time
>> of Patched is approximately half of HEAD.
>
> Sounds good.
>
> To ensure we are doing exactly the same amount of work as before, did you
> see the output of VACUUM VERBOSE?

Sorry, the previous test results I posted were wrong.
I reran the performance test, and the results are:

* 1TB table (visibility map size is 32MB)
HEAD    : 4853.250 ms (00:04.853)
Patched : 3805.789 ms (00:03.806)

* 8TB table (visibility map size is 257MB)
HEAD    : 37853.891 ms (00:37.854)
Patched : 30153.943 ms (00:30.154)

* 32TB table (visibility map size is 1GB)
HEAD    : 151908.344 ms (02:31.908)
Patched : 120560.037 ms (02:00.560)

Since visibility map pages can be cached in shared buffers or the OS
cache by the first scan, the benefit of this patch does not seem to be
large.
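
As a quick check of how large the benefit is, the relative reduction can
be computed from the timings above. This is only a sketch for
illustration; the helper name reduction_percent is made up here and is
not part of any patch.

```python
# Timings in milliseconds, copied from the results above: (HEAD, Patched).
timings_ms = {
    "1TB": (4853.250, 3805.789),
    "8TB": (37853.891, 30153.943),
    "32TB": (151908.344, 120560.037),
}


def reduction_percent(head_ms, patched_ms):
    """Return how much shorter the patched run is, as a percentage of HEAD."""
    return (head_ms - patched_ms) / head_ms * 100.0


reductions = {
    size: reduction_percent(head, patched)
    for size, (head, patched) in timings_ms.items()
}

for size, pct in reductions.items():
    print(f"{size}: patched run is {pct:.1f}% shorter than HEAD")
```

With these numbers the patched run comes out roughly 20 to 22 percent
shorter than HEAD at all three sizes.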

Here are the outputs of VACUUM VERBOSE for the 32TB table.

* HEAD
INFO: vacuuming "public.vm_skip_test"
INFO: "vm_skip_test": found 0 removable, 0 nonremovable row versions
in 0 out of 4294967294 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
Skipped 4294967294 all-frozen pages according to visibility map.
0 pages are entirely empty.
CPU 1.06s/148.11u sec elapsed 149.20 sec.
VACUUM
Time: 151908.344 ms (02:31.908)

* Patched
INFO: vacuuming "public.vm_skip_test"
INFO: "vm_skip_test": found 0 removable, 0 nonremovable row versions
in 0 out of 4294967294 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
Skipped 4294967294 all-frozen pages according to visibility map.
0 pages are entirely empty.
CPU 0.65s/117.15u sec elapsed 117.81 sec.
VACUUM
Time: 120560.037 ms (02:00.560)

The current manual VACUUM doesn't report how many all-frozen pages were
skipped according to the visibility map.
That's why I attached the 0001 patch, which makes manual VACUUM emit
that information.
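
To check that HEAD and Patched do the same amount of work, the page
counts in the two VACUUM VERBOSE outputs can be compared mechanically.
The following is a minimal sketch, assuming the message wording shown
above (with the 0001 patch applied); the function name summarize and the
shortened sample strings are only for illustration.

```python
import re

# Patterns follow the wording of the VACUUM VERBOSE output shown above.
SKIPPED_RE = re.compile(
    r"Skipped (\d+) all-frozen pages according to visibility map"
)
SCANNED_RE = re.compile(r"in (\d+) out of (\d+) pages")


def summarize(verbose_output):
    """Extract the page counters from one VACUUM VERBOSE output."""
    skipped = SKIPPED_RE.search(verbose_output)
    scanned = SCANNED_RE.search(verbose_output)
    return {
        "skipped_all_frozen": int(skipped.group(1)) if skipped else None,
        "scanned_pages": int(scanned.group(1)) if scanned else None,
        "total_pages": int(scanned.group(2)) if scanned else None,
    }


# Shortened copies of the HEAD and Patched outputs above.
head_output = """\
INFO: "vm_skip_test": found 0 removable, 0 nonremovable row versions
in 0 out of 4294967294 pages
Skipped 4294967294 all-frozen pages according to visibility map.
CPU 1.06s/148.11u sec elapsed 149.20 sec.
"""
patched_output = """\
INFO: "vm_skip_test": found 0 removable, 0 nonremovable row versions
in 0 out of 4294967294 pages
Skipped 4294967294 all-frozen pages according to visibility map.
CPU 0.65s/117.15u sec elapsed 117.81 sec.
"""

head_summary = summarize(head_output)
patched_summary = summarize(patched_output)
print("same work:", head_summary == patched_summary)
```

Only the counters are compared; the CPU and elapsed times are expected
to differ between the two runs.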

>
> Can we produce a test that verifies the result patched/unpatched?
>

I've attached a test shell script, but because I don't have such a
large disk, I measured the performance benefit in a somewhat unofficial
way.

To create a situation where the table is extremely large and has a
corresponding visibility map, I applied the 0002 patch and made a fake
visibility map.
The attached 0002 patch adds a GUC parameter, cheat_vacuum_table_size,
which artificially defines the size of the table being vacuumed.
For example, if we do
SET cheat_vacuum_table_size = 4;
VACUUM vm_test;
then in lazy_scan_heap the vm_test table is processed as an
8TB (MaxBlockNumber / 4) table.
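
The arithmetic behind these sizes can be sketched as follows. The sketch
assumes the default 8kB block size, a 24-byte page header, and two
visibility map bits per heap page; the function names are made up for
illustration, and the divisors 32 and 1 for the 1TB and 32TB cases are
inferred from the "MaxBlockNumber / 4" example rather than taken from
the patch.

```python
MAX_BLOCK_NUMBER = 0xFFFFFFFE  # 4294967294, the page count in the output above
BLCKSZ = 8192                  # assumption: default block size in bytes
PAGE_HEADER_SIZE = 24          # assumption: page header size in bytes
BITS_PER_HEAPBLOCK = 2         # assumption: all-visible bit + all-frozen bit

# Number of heap pages whose bits fit on one visibility map page.
HEAPBLOCKS_PER_VM_PAGE = (BLCKSZ - PAGE_HEADER_SIZE) * 8 // BITS_PER_HEAPBLOCK


def faked_heap_pages(divisor):
    """Heap pages assumed when the table is treated as MaxBlockNumber / divisor."""
    return MAX_BLOCK_NUMBER // divisor


def heap_size_tib(pages):
    """Heap size in TiB for the given number of pages."""
    return pages * BLCKSZ / 2**40


def vm_size_mib(pages):
    """Visibility map size in MiB, rounded up to whole map pages."""
    vm_pages = (pages + HEAPBLOCKS_PER_VM_PAGE - 1) // HEAPBLOCKS_PER_VM_PAGE
    return vm_pages * BLCKSZ / 2**20


for divisor in (32, 4, 1):
    pages = faked_heap_pages(divisor)
    print(
        f"divisor {divisor}: {heap_size_tib(pages):.1f} TiB heap, "
        f"{vm_size_mib(pages):.1f} MiB visibility map"
    )
```

Under these assumptions the three cases come out close to the 32MB,
257MB and 1GB visibility map sizes used in the tests.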

The attached test shell script creates fake visibility map files and
runs the performance tests for the 1TB, 8TB and 32TB tables.
Please take a look.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment                                                       Content-Type      Size
0001-lazy_scan_heap-outputs-how-many-all_frozen-pages-are.patch  text/x-patch      1.1 KB
0002-Add-cheat_vacuum_table_size.patch                           text/x-patch      3.6 KB
test_vacuum_skip_with_large_vm.sh                                application/x-sh  1.9 KB
