Re: Parallel bitmap index scan

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel bitmap index scan
Date: 2020-07-26 13:57:45
Message-ID: CAFiTN-sH1D2CxCjVUPgQvprPHy0Y9jnFiO2Rg2--Z9fJobUGqw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jul 26, 2020 at 6:42 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> I would like to propose a patch for enabling the parallelism for the
> bitmap index scan path.
>
> Background:
> Currently, we support only a parallel bitmap heap scan path. Therein,
> the underlying bitmap index scan is done by a single worker called the
> leader. The leader creates a bitmap in shared memory and once the
> bitmap is ready it creates a shared iterator and after that, all the
> workers process the shared iterator and scan the heap in parallel.
> While analyzing the TPCH plan we have observed that some of the
> queries are spending significant time in preparing the bitmap. So the
> idea of this patch is to use the parallel index scan for preparing the
> underlying bitmap in parallel.
>
> Design:
> If underlying index AM supports the parallel path (currently only
> BTREE support it), then we will create a parallel bitmap heap scan
> path on top of the parallel bitmap index scan path. So the idea of
> this patch is that each worker will do the parallel index scan and
> generate their part of the bitmap. And, we will create a barrier so
> that we can not start preparing the shared iterator until all the
> worker is ready with their bitmap. The first worker, which is ready
> with the bitmap will keep a copy of its TBM and the page table in the
> shared memory. And, all the subsequent workers will merge their TBM
> with the shared TBM. Once all the TBM are merged we will get one
> common shared TBM and after that stage, the worker can continue. The
> remaining part is the same, basically, again one worker will scan the
> shared TBM and prepare the shared iterator and once it is ready all
> the workers will jointly scan the heap in parallel using shared
> iterator.
>
> BitmapHeapNext
> {
> ...
> BarrierAttach();
> tbm = MultiExecProcNode();
> tbm_merge(tbm); --Merge with common tbm using tbm_union
> BarrierArriveAndWait();
>
> if (BitmapShouldInitializeSharedState(pstate)). --> only one worker
> come out of this
> {
> tbm_prepare_shared_iterate();
> BitmapDoneInitializingSharedState(). -->wakeup others
> }
> tbm_attach_shared_iterate(). --> all worker attach to shared iterator
> ...
> }
>
> Performance: With scale factor 10, I could see that Q6 is spending
> significant time in a bitmap index scan so I have taken the
> performance with that query and I can see that the bitmap index scan
> node is 3x faster by using 3 workers whereas overall plan got ~40%
> faster.
>
> TPCH: S.F. 10, work_mem=512MB shared_buffers: 1GB
>
> HEAD:
> Limit (cost=1559777.02..1559777.03 rows=1 width=32) (actual
> time=5260.121..5260.122 rows=1 loops=1)
> -> Finalize Aggregate (cost=1559777.02..1559777.03 rows=1
> width=32) (actual time=5260.119..5260.119 rows=1 loops=1)
> -> Gather (cost=1559776.69..1559777.00 rows=3 width=32)
> (actual time=5257.251..5289.595 rows=4 loops=1)
> Workers Planned: 3
> Workers Launched: 3
> -> Partial Aggregate (cost=1558776.69..1558776.70
> rows=1 width=32) (actual time=5247.714..5247.714 rows=1 loops=4)
> -> Parallel Bitmap Heap Scan on lineitem
> (cost=300603.01..1556898.89 rows=375560 width=12) (actual
> time=3475.944..50
> 37.484 rows=285808 loops=4)
> Recheck Cond: ((l_shipdate >=
> '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp
> without tim
> e zone) AND (l_discount >= 0.02) AND (l_discount <= 0.04) AND
> (l_quantity < '24'::numeric))
> Heap Blocks: exact=205250
> -> Bitmap Index Scan on
> idx_lineitem_shipdate (cost=0.00..300311.95 rows=1164235 width=0)
> (actual time=3169.85
> 5..3169.855 rows=1143234 loops=1)
> Index Cond: ((l_shipdate >=
> '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp
> without
> time zone) AND (l_discount >= 0.02) AND (l_discount <= 0.04) AND
> (l_quantity < '24'::numeric))
> Planning Time: 0.659 ms
> Execution Time: 5289.787 ms
> (13 rows)
>
>
> PATCH:
>
> Limit (cost=1559579.85..1559579.86 rows=1 width=32) (actual
> time=3333.572..3333.572 rows=1 loops=1)
> -> Finalize Aggregate (cost=1559579.85..1559579.86 rows=1
> width=32) (actual time=3333.569..3333.569 rows=1 loops=1)
> -> Gather (cost=1559579.52..1559579.83 rows=3 width=32)
> (actual time=3328.619..3365.227 rows=4 loops=1)
> Workers Planned: 3
> Workers Launched: 3
> -> Partial Aggregate (cost=1558579.52..1558579.53
> rows=1 width=32) (actual time=3307.805..3307.805 rows=1 loops=4)
> -> Parallel Bitmap Heap Scan on lineitem
> (cost=300405.84..1556701.72 rows=375560 width=12) (actual
> time=1585.726..30
> 97.628 rows=285808 loops=4)
> Recheck Cond: ((l_shipdate >=
> '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp
> without tim
> e zone) AND (l_discount >= 0.02) AND (l_discount <= 0.04) AND
> (l_quantity < '24'::numeric))
> Heap Blocks: exact=184293
> -> Parallel Bitmap Index Scan on
> idx_lineitem_shipdate (cost=0.00..300311.95 rows=1164235 width=0)
> (actual tim
> e=1008.361..1008.361 rows=285808 loops=4)
> Index Cond: ((l_shipdate >=
> '1997-01-01'::date) AND (l_shipdate < '1998-01-01 00:00:00'::timestamp
> without
> time zone) AND (l_discount >= 0.02) AND (l_discount <= 0.04) AND
> (l_quantity < '24'::numeric))
> Planning Time: 0.690 ms
> Execution Time: 3365.420 ms
>
> Note:
> - Currently, I have only parallelized then bitmap index path when we
> have a bitmap index scan directly under bitmap heap. But, if we have
> BitmapAnd or BitmapOr path then I did not parallelize the underlying
> bitmap index scan. I think for BitmapAnd and BitmapOr we should use a
> completely different design, something similar to what we are doing in
> parallel append so I don't think BitmapAnd and BitmapOr we need to
> cover under this patch.
>
> - POC patch is attached to discuss the idea. The patch still needs
> cleanup and testing.
>

There was one compilation warning so fixed in this version.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v2-0001-POC-Parallel-Bitmap-Index-Scan.patch application/octet-stream 34.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chris Travers 2020-07-26 17:27:15 Re: Making CASE error handling less surprising
Previous Message Dilip Kumar 2020-07-26 13:43:28 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions