Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE

From: David Geier <geidav(dot)pg(at)gmail(dot)com>
To: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE
Date: 2023-01-20 08:34:26
Message-ID: b3d80961-c2e5-38cc-6a32-61886cdf766d@gmail.com
Lists: pgsql-hackers

Hi hackers,

EXPLAIN ANALYZE for parallel Bitmap Heap Scans currently only reports
the number of heap blocks processed by the leader. It's missing the
per-worker stats. The attached patch adds that functionality in the
spirit of e.g. Sort or Memoize. Here is a simple test case and the
EXPLAIN ANALYZE output with and without the patch:

create table foo(col0 int, col1 int);
insert into foo select generate_series(1, 1000, 0.001),
                       generate_series(1000, 2000, 0.001);
create index idx0 on foo(col0);
create index idx1 on foo(col1);
set parallel_tuple_cost = 0;
set parallel_setup_cost = 0;
explain (analyze, costs off, timing off)
  select * from foo where col0 > 900 or col1 = 1;

With the patch:

 Gather (actual rows=99501 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Bitmap Heap Scan on foo (actual rows=33167 loops=3)
         Recheck Cond: ((col0 > 900) OR (col1 = 1))
         Heap Blocks: exact=98
         Worker 0:  Heap Blocks: exact=171 lossy=0
         Worker 1:  Heap Blocks: exact=172 lossy=0
         ->  BitmapOr (actual rows=0 loops=1)
               ->  Bitmap Index Scan on idx0 (actual rows=99501 loops=1)
                     Index Cond: (col0 > 900)
               ->  Bitmap Index Scan on idx1 (actual rows=0 loops=1)
                     Index Cond: (col1 = 1)

Without the patch:

 Gather (actual rows=99501 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Bitmap Heap Scan on foo (actual rows=33167 loops=3)
         Recheck Cond: ((col0 > 900) OR (col1 = 1))
         Heap Blocks: exact=91
         ->  BitmapOr (actual rows=0 loops=1)
               ->  Bitmap Index Scan on idx0 (actual rows=99501 loops=1)
                     Index Cond: (col0 > 900)
               ->  Bitmap Index Scan on idx1 (actual rows=0 loops=1)
                     Index Cond: (col1 = 1)

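So in total the parallel Bitmap Heap Scan actually processed 441 heap
blocks (98 + 171 + 172 in the patched run) instead of just the 91 that
the unpatched output reports for the leader. The leader's own count
differs between the two runs shown here.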

With the patch, two variable length arrays (VLA) would be needed, one
for the snapshot and one for the stats. As a struct can only end in a
single such array, I now use a single, big VLA and added functions to
retrieve pointers to the respective fields. I'm using MAXALIGN() to make
sure the latter field is aligned properly. Am I doing this correctly?
I'm not entirely sure about the alignment conventions and requirements
of other platforms.
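
For readers following along, here is a minimal standalone sketch of the
single-array layout described above. It is not the code from the
attached patch: the struct, field, and function names are hypothetical,
and SKETCH_MAXALIGN() is a stand-in for PostgreSQL's MAXALIGN() built on
C11's max_align_t. The sketch assumes the allocation itself starts at a
maximally aligned address, so the offset of the stats is rounded up
relative to the start of the struct rather than the start of the array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for PostgreSQL's MAXALIGN(): round LEN up to a multiple of the
 * strictest fundamental alignment on this platform (assumption: C11). */
#define SKETCH_ALIGNOF_MAX _Alignof(max_align_t)
#define SKETCH_MAXALIGN(LEN) \
    (((size_t) (LEN) + SKETCH_ALIGNOF_MAX - 1) & ~(SKETCH_ALIGNOF_MAX - 1))

/* Hypothetical per-worker counters. */
typedef struct WorkerStats
{
    long exact_pages;
    long lossy_pages;
} WorkerStats;

/* Hypothetical shared state: one flexible array holds both variable parts.
 * Layout of data[]: snapshot bytes, padding, WorkerStats[num_workers]. */
typedef struct SharedScanState
{
    size_t snapshot_len;
    int    num_workers;
    char   data[];
} SharedScanState;

/* Byte offset of the stats array, measured from the start of the struct. */
size_t
shared_state_stats_offset(size_t snapshot_len)
{
    return SKETCH_MAXALIGN(offsetof(SharedScanState, data) + snapshot_len);
}

/* Total number of bytes to allocate for the struct and both arrays. */
size_t
shared_state_size(size_t snapshot_len, int num_workers)
{
    return shared_state_stats_offset(snapshot_len) +
        (size_t) num_workers * sizeof(WorkerStats);
}

/* Accessor for the first variable-length part (needs no extra alignment). */
char *
shared_state_snapshot(SharedScanState *state)
{
    return state->data;
}

/* Accessor for the second part, which starts at a max-aligned offset. */
WorkerStats *
shared_state_worker_stats(SharedScanState *state)
{
    return (WorkerStats *)
        ((char *) state + shared_state_stats_offset(state->snapshot_len));
}

/* Allocate zeroed state and copy the snapshot bytes in. calloc() returns
 * memory suitably aligned for any object, which the offsets rely on. */
SharedScanState *
shared_state_create(const void *snapshot, size_t snapshot_len, int num_workers)
{
    SharedScanState *state;

    assert(num_workers >= 0);
    state = calloc(1, shared_state_size(snapshot_len, num_workers));
    if (state == NULL)
        return NULL;
    state->snapshot_len = snapshot_len;
    state->num_workers = num_workers;
    memcpy(shared_state_snapshot(state), snapshot, snapshot_len);
    return state;
}
```

A caller would size the allocation with shared_state_size(), then use
the two accessors instead of touching data[] directly. In shared memory
the same reasoning only holds if the chunk handed out by the allocator
is itself maximally aligned, which is an assumption of this sketch.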

I couldn't find existing tests that exercise the EXPLAIN ANALYZE output
of specific nodes. I could only find a few basic smoke tests for EXPLAIN
ANALYZE with parallel nodes in select_parallel.sql. Do we want tests for
the changed functionality? If so, I could also add tests for the EXPLAIN
ANALYZE output of other parallel nodes while I'm at it.

Thank you for your feedback.

--
David Geier
(ServiceNow)

Attachment Content-Type Size
v1-0001-Parallel-Bitmap-Heap-Scan-reports-per-worker-stat.patch text/x-patch 11.0 KB
