Re: Use streaming read API in ANALYZE

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, jakub(dot)wartak(at)enterprisedb(dot)com, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Use streaming read API in ANALYZE
Date: 2024-04-07 20:59:26
Message-ID: CAAKRu_Z+GTgZo0SuPSnVAb6HyHgMOsmdwboB9kzDwBr7i3nV3w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Apr 7, 2024 at 3:57 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Thu, Apr 04, 2024 at 02:03:30PM +0300, Nazir Bilal Yavuz wrote:
> >
> > On Wed, 3 Apr 2024 at 23:44, Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote:
> > >
> > > I don't see the point of BlockSampler_HasMore() anymore. I removed it in
> > > the attached and made BlockSampler_Next() return InvalidBlockNumber
> > > under the same conditions. Is there a reason not to do this? There
> > > aren't other callers. If the BlockSampler_Next() wasn't part of an API,
> > > we could just make it the streaming read callback, but that might be
> > > weird as it is now.
> >
> > I agree. There is no reason to have BlockSampler_HasMore() after
> > streaming read API changes.
> >
> > > That and my other ideas in attached. Let me know what you think.
> >
> > I agree with your changes but I am not sure if others agree with all
> > the changes you have proposed. So, I didn't merge 0001 and your ideas
> > yet, instead I wrote a commit message, added some comments, changed ->
> > 'if (bs->t >= bs->N || bs->m >= bs->n)' to 'if (K <= 0 || k <= 0)' and
> > attached it as 0002.
>
> I couldn't quite let go of those changes to acquire_sample_rows(), so
> attached v9 0001 implements them as a preliminary patch before your
> analyze streaming read user. I inlined heapam_scan_analyze_next_block()
> entirely and made heapam_scan_analyze_next_tuple() a static function in
> commands/analyze.c (and tweaked the name).
>
> I made a few tweaks to your patch since it is on top of those changes
> instead of preceding them. Then 0003 is removing BlockSampler_HasMore()
> since it doesn't make sense to remove it before the streaming read user
> was added.

I realized there were a few outdated comments. Fixed in attached v10.

- Melanie

Attachment Content-Type Size
v10-0001-Make-heapam_scan_analyze_next_-tuple-block-stati.patch text/x-patch 17.9 KB
v10-0002-Use-streaming-read-API-in-ANALYZE.patch text/x-patch 5.4 KB
v10-0003-Obsolete-BlockSampler_HasMore.patch text/x-patch 2.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nazir Bilal Yavuz 2024-04-07 21:01:26 Re: Streaming I/O, vectored I/O (WIP)
Previous Message Tomas Vondra 2024-04-07 20:54:23 Re: Add bump memory context type and use it for tuplesorts