Re: BitmapHeapScan streaming read user and prelim refactoring

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2024-02-14 04:34:20
Message-ID: 8DE07ADE-1D86-4680-8793-057FAAE35B5C@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Feb 13, 2024, at 3:11 PM, Melanie Plageman <melanieplageman(at)gmail(dot)com> wrote:

Thanks for the patch...

> Attached is a patch set which refactors BitmapHeapScan such that it
> can use the streaming read API [1]. It also resolves the long-standing
> FIXME in the BitmapHeapScan code suggesting that the skip fetch
> optimization should be pushed into the table AMs. Additionally, it
> moves table scan initialization to after the index scan and bitmap
> initialization.
>
> patches 0001-0002 are assorted cleanup needed later in the set.
> patches 0003 moves the table scan initialization to after bitmap creation
> patch 0004 is, I think, a bug fix. see [2].
> patches 0005-0006 push the skip fetch optimization into the table AMs
> patches 0007-0009 change the control flow of BitmapHeapNext() to match
> that required by the streaming read API
> patch 0010 is the streaming read code not yet in master
> patch 0011 is the actual bitmapheapscan streaming read user.
>
> patches 0001-0009 apply on top of master but 0010 and 0011 must be
> applied on top of a commit before a 21d9c3ee4ef74e2 (until a rebased
> version of the streaming read API is on the mailing list).

I followed your lead and applied them to 6a8ffe812d194ba6f4f26791b6388a4837d17d6c. `make check` worked fine, though I expect you know that already.

> The caveat is that these patches introduce breaking changes to two
> table AM functions for bitmapheapscan: table_scan_bitmap_next_block()
> and table_scan_bitmap_next_tuple().

You might want an independent perspective on how much of a hassle those breaking changes are, so I took a stab at that. Having written a custom proprietary TAM for postgresql 15 here at EDB, and having ported it and released it for postgresql 16, I thought I'd try porting it to the the above commit with your patches. Even without your patches, I already see breaking changes coming from commit f691f5b80a85c66d715b4340ffabb503eb19393e, which creates a similar amount of breakage for me as does your patches. Dealing with the combined breakage might amount to a day of work, including testing, half of which I think I've already finished. In other words, it doesn't seem like a big deal.

Were postgresql 17 shaping up to be compatible with TAMs written for 16, your patch would change that qualitatively, but since things are already incompatible, I think you're in the clear.

> A TBMIterateResult used to be threaded through both of these functions
> and used in BitmapHeapNext(). This patch set removes all references to
> TBMIterateResults from BitmapHeapNext. Because the streaming read API
> requires the callback to specify the next block, BitmapHeapNext() can
> no longer pass a TBMIterateResult to table_scan_bitmap_next_block().
>
> More subtly, table_scan_bitmap_next_block() used to return false if
> there were no more visible tuples on the page or if the block that was
> requested was not valid. With these changes,
> table_scan_bitmap_next_block() will only return false when the bitmap
> has been exhausted and the scan can end. In order to use the streaming
> read API, the user must be able to request the blocks it needs without
> requiring synchronous feedback per block. Thus, this table AM function
> must change its meaning.
>
> I think the way the patches are split up could be improved. I will
> think more about this. There are also probably a few mistakes with
> which comments are updated in which patches in the set.

I look forward to the next version of the patch set. Thanks again for working on this.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2024-02-14 05:00:00 Re: SQL:2011 application time
Previous Message Andrei Lepikhov 2024-02-14 04:19:59 Re: Memory consumed by child SpecialJoinInfo in partitionwise join planning