Proposal: Improve bitmap costing for lossy pages

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Proposal: Improve bitmap costing for lossy pages
Date: 2017-05-18 06:52:23
Message-ID: CAFiTN-sYtqUOXQ4SpuhTv0Z9gD0si3YxZGv_PQAAMX8qbOotcg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I would like to propose a patch to improve the cost of bitmap heap
scan that is sensitive to work_mem. Currently, in bitmap scan, we
don't consider work_mem. Now, in cases when there are a lot of lossy
pages bitmap scan gets selected that eventually leads to degraded
performance.

While evaluating parallel bitmap heap scan on TPCH we noticed that in
many queries selecting bitmap heap scan gives good performance high
work_mem but at low work_mem it slows the query compared to sequence
scan or index scan. This shows that bitmap heap scan is a better
alternative when most of the heap pages fit into work_mem.

Attached POC patch fixes the problem by considering work_mem for bitmap costing.

Performance numbers with this patch on different values of work_mem
are as follows,
workload: TPCH scale factor 20
machine: POWER 8

work_mem = 4MB
Query Head(ms) Patch(ms) Improvement Change in plan
4 13759.632 14464.491 0.95x PBHS -> PSS
5 47581.558 41888.853 1.14x BHS -> SS
6 14051.553 13853.449 1.01x PBHS -> PSS
8 21529.98 11289.25 1.91x PBHS -> PSS
10 37844.51 34460.669 1.10x BHS -> SS
14 10131.49 15281.49 0.66x BHS -> SS
15 43579.833 34971.051 1.25x BHS -> SS

work_mem = 20MB
Query Head(ms) Patch(ms) Improvement Change in plan
6 14592 13521.06 1.08x PBHS -> PSS
8 20223.106 10716.062 1.89x PBHS -> PSS
15 40486.957 33687.706 1.20x BHS -> PSS

work_mem = 64MB
Query Head(ms) Patch(ms) Improvement Change in plan
15 40904.572 25750.873 1.59x BHS -> PBHS

work_mem = 1GB
No plan got changed

Most of the queries show decent improvement, however, Q14 shows
regression at work_mem = 4MB. On analysing this case, I found that
number of pages_fetched calculated by "Mackert and Lohman formula" is
very high (1112817) compared to the actual unique heap pages fetched
(293314). Therefore, while costing bitmap scan using 1112817 pages and
4MB of work_mem, we predicted that even after we lossify all the pages
it can not fit into work_mem, hence bitmap scan was not selected.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
improve_bitmap_cost_v1.patch application/octet-stream 3.9 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2017-05-18 07:17:10 Typo in json.c
Previous Message Masahiko Sawada 2017-05-18 06:45:36 Re: [Proposal] Allow users to specify multiple tables in VACUUM commands