Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, David Geier <geidav(dot)pg(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Melanie Plageman <melanieplageman(at)gmail(dot)com>
Subject: Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE
Date: 2024-03-14 15:30:30
Message-ID: ab3b05d5-c736-41d9-be6c-f747775baa8e@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 18/02/2024 00:31, Tomas Vondra wrote:
> Do you plan to work continue working on this patch? I did take a look,
> and on the whole it looks reasonable - it modifies the right places etc.

+1

> I think there are two things that may need an improvement:
>
> 1) Storing variable-length data in ParallelBitmapHeapState
>
> I agree with Robert the snapshot_and_stats name is not great. I see
> Dmitry mentioned phs_snapshot_off as used by ParallelTableScanDescData -
> the reasons are somewhat different (phs_snapshot_off exists because we
> don't know which exact struct will be allocated), while here we simply
> need to allocate two variable-length pieces of memory. But it seems like
> it would work nicely for this. That is, we could try adding an offset
> for each of those pieces of memory:
>
> - snapshot_off
> - stats_off
>
> I don't like the GetSharedSnapshotData name very much, it seems very
> close to GetSnapshotData - quite confusing, I think.
>
> Dmitry also suggested we might add a separate piece of shared memory. I
> don't quite see how that would work for ParallelBitmapHeapState, but I
> doubt it'd be simpler than having two offsets. I don't think the extra
> complexity (paid by everyone) would be worth it just to make EXPLAIN
> ANALYZE work.

I just removed phs_snapshot_data in commit 84c18acaf6. I thought that
would make this moot, but now that I rebased this, there are stills some
aesthetic questions on how best to represent this.

In all the other node types that use shared instrumentation like this,
the pattern is as follows: (using Memoize here as an example, but it's
similar for Sort, IncrementalSort, Agg and Hash)

/* ----------------
* Shared memory container for per-worker memoize information
* ----------------
*/
typedef struct SharedMemoizeInfo
{
int num_workers;
MemoizeInstrumentation sinstrument[FLEXIBLE_ARRAY_MEMBER];
} SharedMemoizeInfo;

/* this struct is backend-private */
typedef struct MemoizeState
{
ScanState ss; /* its first field is NodeTag */
...
MemoizeInstrumentation stats; /* execution statistics */
SharedMemoizeInfo *shared_info; /* statistics for parallel workers */
} MemoizeState;

While the scan is running, the node updates its private data in
MemoizeState->stats. At the end of a parallel scan, the worker process
copies the MemoizeState->stats to MemoizeState->shared_info->stats,
which lives in shared memory. The leader process copies
MemoizeState->shared_info->stats to its own backend-private copy, which
it then stores in its MemoizeState->shared_info, replacing the pointer
to the shared memory with a pointer to the private copy. That happens in
ExecMemoizeRetrieveInstrumentation().

This is a little different for parallel bitmap heap scans, because a
bitmap heap scan keeps some other data in shared memory too, not just
instrumentation data. Also, the naming is inconsistent: the equivalent
of SharedMemoizeInfo is actually called ParallelBitmapHeapState. I think
we should rename it to SharedBitmapHeapInfo, to make it clear that it
lives in shared memory, but I digress.

We could now put the new stats at the end of ParallelBitmapHeapState as
a varlen field. But I'm not sure that's a good idea. In
ExecBitmapHeapRetrieveInstrumentation(), would we make a backend-private
copy of the whole ParallelBitmapHeapState struct, even though the other
fields don't make sense after the shared memory is released? Sounds
confusing. Or we could introduce a separate struct for the stats, and
copy just that:

typedef struct SharedBitmapHeapInstrumentation
{
int num_workers;
BitmapHeapScanInstrumentation sinstrument[FLEXIBLE_ARRAY_MEMBER];
} SharedBitmapHeapInstrumentation;

typedef struct BitmapHeapScanState
{
ScanState ss; /* its first field is NodeTag */
...
SharedBitmapHeapInstrumentation sinstrument;
} BitmapHeapScanState;

that compiles, at least with my compiler, but I find it weird to have a
variable-length inner struct embedded in an outer struct like that.

Long story short, I think it's still better to store
ParallelBitmapHeapInstrumentationInfo separately in the DSM chunk, not
as part of ParallelBitmapHeapState. Attached patch does that, rebased
over current master.

I didn't address any of the other things that you, Tomas, pointed out,
but I think they're valid concerns.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachment Content-Type Size
v3-0001-Parallel-Bitmap-Heap-Scan-reports-per-worker-stat.patch text/x-patch 11.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-03-14 15:45:00 Re: Add new protocol message to change GUCs for usage with future protocol-only GUCs
Previous Message Peter Eisentraut 2024-03-14 15:10:29 Re: UUID v7