Re: Parallel bitmap heap scan

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel bitmap heap scan
Date: 2016-10-19 15:53:56
Message-ID: CAFiTN-tEew6uUxgRv3nFaQTBD4Cku-XCpMMa0pMYXy1T4SGbmQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 19, 2016 at 12:39 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Try measuring with something more heavy on bitmap scan time
> itself. E.g.
> SELECT SUM(l_extendedprice) FROM lineitem WHERE (l_shipdate >= '1995-01-01'::date) AND (l_shipdate <= '1996-12-31'::date);
> or similar. The tpch queries don't actually spend that much time in the
> bitmapscan itself - the parallization of the rest of the query is what
> matters...

Yeah, I agree.

I have tested with this query, with exact filter condition it was
taking parallel sequence scan, so I have modified the filter a bit and
tested.

Tested with all default configuration in my local machine. I think I
will generate more such test cases and do detail testing in my
performance machine.

Explain Analyze results:
---------------------------------
On Head:
------------
postgres=# explain analyze SELECT SUM(l_extendedprice) FROM lineitem
WHERE (l_shipdate >= '1995-01-01'::date) AND (l_shipdate <=
'1996-03-31'::date);

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=848805.90..848805.91 rows=1 width=32) (actual
time=12440.165..12440.166 rows=1 loops=1)
-> Bitmap Heap Scan on lineitem (cost=143372.40..834833.25
rows=5589057 width=8) (actual time=1106.217..11183.722 rows=5678841
loops=1)
Recheck Cond: ((l_shipdate >= '1995-01-01'::date) AND
(l_shipdate <= '1996-03-31'::date))
Rows Removed by Index Recheck: 20678739
Heap Blocks: exact=51196 lossy=528664
-> Bitmap Index Scan on idx_lineitem_shipdate
(cost=0.00..141975.13 rows=5589057 width=0) (actual
time=1093.376..1093.376 rows=5678841 loops=1)
Index Cond: ((l_shipdate >= '1995-01-01'::date) AND
(l_shipdate <= '1996-03-31'::date))
Planning time: 0.185 ms
Execution time: 12440.819 ms
(9 rows)

After Patch:
---------------
postgres=# explain analyze SELECT SUM(l_extendedprice) FROM lineitem
WHERE (l_shipdate >= '1995-01-01'::date) AND (l_shipdate <=
'1996-03-31'::date);

QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------
---------
Finalize Aggregate (cost=792751.16..792751.17 rows=1 width=32)
(actual time=6660.157..6660.157 rows=1 loops=1)
-> Gather (cost=792750.94..792751.15 rows=2 width=32) (actual
time=6659.378..6660.117 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=791750.94..791750.95 rows=1
width=32) (actual time=6655.941..6655.941 rows=1 loops=3)
-> Parallel Bitmap Heap Scan on lineitem
(cost=143372.40..785929.00 rows=2328774 width=8) (actual
time=1980.797..6204.232 rows=1892947 loops=
3)
Recheck Cond: ((l_shipdate >= '1995-01-01'::date)
AND (l_shipdate <= '1996-03-31'::date))
Rows Removed by Index Recheck: 6930269
Heap Blocks: exact=17090 lossy=176443
-> Bitmap Index Scan on idx_lineitem_shipdate
(cost=0.00..141975.13 rows=5589057 width=0) (actual
time=1933.454..1933.454 rows=5678841
loops=1)
Index Cond: ((l_shipdate >=
'1995-01-01'::date) AND (l_shipdate <= '1996-03-31'::date))
Planning time: 0.207 ms
Execution time: 6669.195 ms
(13 rows)

Summary:
-> With patch overall execution is 2 time faster compared to head.
-> Bitmap creation with patch is bit slower compared to head and thats
because of DHT vs efficient hash table.

I found one defect in v2 patch, that I induced during last rebasing.
That is fixed in v3.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
parallel-bitmap-heap-scan-v3.patch application/octet-stream 70.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-10-19 16:04:27 Re: Patch: Implement failover on libpq connect level.
Previous Message Greg Stark 2016-10-19 15:50:17 packing/alignment annotation for ItemPointerData redux