Re: Use streaming read API in ANALYZE

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, jakub(dot)wartak(at)enterprisedb(dot)com
Subject: Re: Use streaming read API in ANALYZE
Date: 2024-04-03 20:44:20
Message-ID: 20240403204420.jtzssefkzxtdzouo@liskov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 03, 2024 at 10:25:01PM +0300, Nazir Bilal Yavuz wrote:
>
> I realized the same error while working on Jakub's benchmarking results.
>
> Cause: I was using the nblocks variable to check how many blocks will
> be returned from the streaming API. But I realized that sometimes the
> number returned from BlockSampler_Init() is not equal to the number of
> blocks that BlockSampler_Next() will return as BlockSampling algorithm
> decides how many blocks to return on the fly by using some random
> seeds.
>
> There are a couple of solutions I thought of:
>
> 1- Use BlockSampler_HasMore() instead of nblocks in the main loop in
> the acquire_sample_rows():
>
> Streaming API uses this function to prefetch block numbers.
> BlockSampler_HasMore() will reach to the end first as it is used while
> prefetching, so it will start to return false while there are still
> buffers to return from the streaming API. That will cause some buffers
> at the end to not be processed.
>
> 2- Expose something (function, variable etc.) from the streaming API
> to understand if the read is finished and there is no buffer to
> return:
>
> I think this works but I am not sure if the streaming API allows
> something like that.
>
> 3- Check every buffer returned from the streaming API, if it is
> invalid stop the main loop in the acquire_sample_rows():
>
> This solves the problem but there will be two if checks for each
> buffer returned,
> - in heapam_scan_analyze_next_block() to check if the returned buffer is invalid
> - to break main loop in acquire_sample_rows() if
> heapam_scan_analyze_next_block() returns false
> One of the if cases can be bypassed by moving
> heapam_scan_analyze_next_block()'s code to the main loop in the
> acquire_sample_rows().
>
> I implemented the third solution, here is v6.

I've reviewed the patches inline below and attached a patch that has
some of my ideas on top of your patch.

> From 8d396a42186325f920d5a05e7092d8e1b66f3cdf Mon Sep 17 00:00:00 2001
> From: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
> Date: Wed, 3 Apr 2024 15:14:15 +0300
> Subject: [PATCH v6] Use streaming read API in ANALYZE
>
> ANALYZE command gets random tuples using BlockSampler algorithm. Use
> streaming reads to get these tuples by using BlockSampler algorithm in
> streaming read API prefetch logic.
> ---
> src/include/access/heapam.h | 6 +-
> src/backend/access/heap/heapam_handler.c | 22 +++---
> src/backend/commands/analyze.c | 85 ++++++++----------------
> 3 files changed, 42 insertions(+), 71 deletions(-)
>
> diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
> index a307fb5f245..633caee9d95 100644
> --- a/src/include/access/heapam.h
> +++ b/src/include/access/heapam.h
> @@ -25,6 +25,7 @@
> #include "storage/bufpage.h"
> #include "storage/dsm.h"
> #include "storage/lockdefs.h"
> +#include "storage/read_stream.h"
> #include "storage/shm_toc.h"
> #include "utils/relcache.h"
> #include "utils/snapshot.h"
> @@ -388,9 +389,8 @@ extern bool HeapTupleIsSurelyDead(HeapTuple htup,
> struct GlobalVisState *vistest);
>
> /* in heap/heapam_handler.c*/
> -extern void heapam_scan_analyze_next_block(TableScanDesc scan,
> - BlockNumber blockno,
> - BufferAccessStrategy bstrategy);
> +extern bool heapam_scan_analyze_next_block(TableScanDesc scan,
> + ReadStream *stream);
> extern bool heapam_scan_analyze_next_tuple(TableScanDesc scan,
> TransactionId OldestXmin,
> double *liverows, double *deadrows,
> diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
> index 0952d4a98eb..d83fbbe6af3 100644
> --- a/src/backend/access/heap/heapam_handler.c
> +++ b/src/backend/access/heap/heapam_handler.c
> @@ -1054,16 +1054,16 @@ heapam_relation_copy_for_cluster(Relation OldHeap, Relation NewHeap,
> }
>
> /*
> - * Prepare to analyze block `blockno` of `scan`. The scan has been started
> - * with SO_TYPE_ANALYZE option.
> + * Prepare to analyze block returned from streaming object. If the block returned
> + * from streaming object is valid, true is returned; otherwise false is returned.
> + * The scan has been started with SO_TYPE_ANALYZE option.
> *
> * This routine holds a buffer pin and lock on the heap page. They are held
> * until heapam_scan_analyze_next_tuple() returns false. That is until all the
> * items of the heap page are analyzed.
> */
> -void
> -heapam_scan_analyze_next_block(TableScanDesc scan, BlockNumber blockno,
> - BufferAccessStrategy bstrategy)
> +bool
> +heapam_scan_analyze_next_block(TableScanDesc scan, ReadStream *stream)
> {
> HeapScanDesc hscan = (HeapScanDesc) scan;
>
> @@ -1076,11 +1076,15 @@ heapam_scan_analyze_next_block(TableScanDesc scan, BlockNumber blockno,
> * doing much work per tuple, the extra lock traffic is probably better
> * avoided.

Personally I think heapam_scan_analyze_next_block() should be inlined.
It only has a few lines. I would find it clearer inline. At the least,
there is no reason for it (or heapam_scan_analyze_next_tuple()) to take
a TableScanDesc instead of a HeapScanDesc.

> */
> - hscan->rs_cblock = blockno;
> - hscan->rs_cindex = FirstOffsetNumber;
> - hscan->rs_cbuf = ReadBufferExtended(scan->rs_rd, MAIN_FORKNUM,
> - blockno, RBM_NORMAL, bstrategy);
> + hscan->rs_cbuf = read_stream_next_buffer(stream, NULL);
> + if (hscan->rs_cbuf == InvalidBuffer)
> + return false;
> +
> LockBuffer(hscan->rs_cbuf, BUFFER_LOCK_SHARE);
> +
> + hscan->rs_cblock = BufferGetBlockNumber(hscan->rs_cbuf);
> + hscan->rs_cindex = FirstOffsetNumber;
> + return true;
> }

> /*
> diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
> index 2fb39f3ede1..764520d5aa2 100644
> --- a/src/backend/commands/analyze.c
> +++ b/src/backend/commands/analyze.c
> @@ -1102,6 +1102,20 @@ examine_attribute(Relation onerel, int attnum, Node *index_expr)
> return stats;
> }
>
> +/*
> + * Prefetch callback function to get next block number while using
> + * BlockSampling algorithm
> + */
> +static BlockNumber
> +block_sampling_streaming_read_next(ReadStream *stream,
> + void *user_data,
> + void *per_buffer_data)
> +{
> + BlockSamplerData *bs = user_data;
> +
> + return BlockSampler_HasMore(bs) ? BlockSampler_Next(bs) : InvalidBlockNumber;

I don't see the point of BlockSampler_HasMore() anymore. I removed it in
the attached and made BlockSampler_Next() return InvalidBlockNumber
under the same conditions. Is there a reason not to do this? There
aren't other callers. If the BlockSampler_Next() wasn't part of an API,
we could just make it the streaming read callback, but that might be
weird as it is now.

That and my other ideas in attached. Let me know what you think.

- Melanie

Attachment Content-Type Size
v7-0001-Use-streaming-read-API-in-ANALYZE.patch text/x-diff 7.3 KB
v7-0002-some-ideas.patch text/x-diff 8.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2024-04-03 21:15:59 Re: On disable_cost
Previous Message Regina Obe 2024-04-03 20:34:48 RE: Can't compile PG 17 (master) from git under Msys2 autoconf