Re: BitmapHeapScan streaming read user and prelim refactoring

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2024-04-05 08:06:34
Message-ID: 20240405080634.d7c23f46xjhhxf5q@liskov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 04, 2024 at 04:35:45PM +0200, Tomas Vondra wrote:
>
>
> On 4/4/24 00:57, Melanie Plageman wrote:
> > On Sun, Mar 31, 2024 at 11:45:51AM -0400, Melanie Plageman wrote:
> >> On Fri, Mar 29, 2024 at 12:05:15PM +0100, Tomas Vondra wrote:
> >>>
> >>>
> >>> On 3/29/24 02:12, Thomas Munro wrote:
> >>>> On Fri, Mar 29, 2024 at 10:43 AM Tomas Vondra
> >>>> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> >>>>> I think there's some sort of bug, triggering this assert in heapam
> >>>>>
> >>>>> Assert(BufferGetBlockNumber(hscan->rs_cbuf) == tbmres->blockno);
> >>>>
> >>>> Thanks for the repro. I can't seem to reproduce it (still trying) but
> >>>> I assume this is with Melanie's v11 patch set which had
> >>>> v11-0016-v10-Read-Stream-API.patch.
> >>>>
> >>>> Would you mind removing that commit and instead applying the v13
> >>>> stream_read.c patches[1]? v10 stream_read.c was a little confused
> >>>> about random I/O combining, which I fixed with a small adjustment to
> >>>> the conditions for the "if" statement right at the end of
> >>>> read_stream_look_ahead(). Sorry about that. The fixed version, with
> >>>> eic=4, with your test query using WHERE a < a, ends its scan with:
> >>>>
> >>>
> >>> I'll give that a try. Unfortunately unfortunately the v11 still has the
> >>> problem I reported about a week ago:
> >>>
> >>> ERROR: prefetch and main iterators are out of sync
> >>>
> >>> So I can't run the full benchmarks :-( but master vs. streaming read API
> >>> should work, I think.
> >>
> >> Odd, I didn't notice you reporting this ERROR popping up. Now that I
> >> take a look, v11 (at least, maybe also v10) had this very sill mistake:
> >>
> >>   if (scan->bm_parallel == NULL &&
> >>       scan->rs_pf_bhs_iterator &&
> >>       hscan->pfblockno > hscan->rs_base.blockno)
> >>        elog(ERROR, "prefetch and main iterators are out of sync");
> >>
> >> It errors out if the prefetch block is ahead of the current block --
> >> which is the opposite of what we want. I've fixed this in attached v12.
> >>
> >> This version also has v13 of the streaming read API. I noticed one
> >> mistake in my bitmapheap scan streaming read user -- it freed the
> >> streaming read object at the wrong time. I don't know if this was
> >> causing any other issues, but it at least is fixed in this version.
> >
> > Attached v13 is rebased over master (which includes the streaming read
> > API now). I also reset the streaming read object on rescan instead of
> > creating a new one each time.
> >
> > I don't know how much chance any of this has of going in to 17 now, but
> > I thought I would start looking into the regression repro Tomas provided
> > in [1].
> >
>
> My personal opinion is that we should try to get in as many of the the
> refactoring patches as possible, but I think it's probably too late for
> the actual switch to the streaming API.

Cool. In the attached v15, I have dropped all commits that are related
to the streaming read API and included *only* commits that are
beneficial to master. A few of the commits are merged or reordered as
well.

While going through the commits with this new goal in mind (forget about
the streaming read API for now), I realized that it doesn't make much
sense to just eliminate the layering violation for the current block and
leave it there for the prefetch block. I had de-prioritized solving this
when I thought we would just delete the prefetch code and replace it
with the streaming read.

Now that we aren't doing that, I've spent the day trying to resolve the
issues with pushing the prefetch code into heapam.c that I cited in [1].
0010 - 0013 are the result of this. They are not very polished yet and
need more cleanup and review (especially 0011, which is probably too
large), but I am happy with the solution I came up with.

Basically, there are too many members needed for bitmap heap scan to put
them all in the HeapScanDescData (don't want to bloat it). So, I've made
a new BitmapHeapScanDescData and associated begin/rescan/end() functions

In the end, with all patches applied, BitmapHeapNext() loops invoking
table_scan_bitmap_next_tuple() and table AMs can implement that however
they choose.

> > I'm also not sure if I should try and group the commits into fewer
> > commits now or wait until I have some idea of whether or not the
> > approach in 0013 and 0014 is worth pursuing.
> >
>
> You mean whether to pursue the approach in general, or for v17? I think
> it looks like the right approach, but for v17 see above :-(
>
> As for merging, I wouldn't do that. I looked at the commits and while
> some of them seem somewhat "trivial", I really like how you organized
> the commits, and kept those that just "move" code around, and those that
> actually change stuff. It's much easier to understand, IMO.
>
> I went through the first ~10 commits, and added some review - either as
> a separate commit, when possible, in the code as XXX comment, and also
> in the commit message. The code tweaks are utterly trivial (whitespace
> or indentation to make the line shorter). It shouldn't take much time to
> deal with those, I think.

Attached v15 incorporates your v14-0002-review.

For your v14-0008-review, I actually ended up removing that commit
because once I removed everything that was for streaming read API, it
became redundant with another commit.

For your v14-0010-review, we actually can't easily get rid of those
local variables because we make the iterators before we make the scan
descriptors and the commit following that commit moves the iterators
from the BitmapHeapScanState to the scan descriptor.

> I think the main focus should be updating the commit messages. If it was
> only a single patch, I'd probably try to write the messages myself, but
> with this many patches it'd be great if you could update those and I'll
> review that before commit.

I did my best to update the commit messages to be less specific and more
focused on "why should I care". I found myself wanting to explain why I
implemented something the way I did and then getting back into the
implementation details again. I'm not sure if I suceeded in having less
details and more substance.

> I always struggle with writing commit messages myself, and it takes me
> ages to write a good one (well, I think the message is good, but who
> knows ...). But I think a good message should be concise enough to
> explain what and why it's done. It may reference a thread for all the
> gory details, but the basic reasoning should be in the commit message.
> For example the message for "BitmapPrefetch use prefetch block recheck
> for skip fetch" now says that it "makes more sense to do X" but does not
> really say why that's the case. The linked message does, but it'd be
> good to have that in the message (because how would I know how much of
> the thread to read?).

I fixed that particular one. I tried to take that feedback and apply it
to other commit messages. I don't know how successful I was...

> Also, it'd be very helpful if you could update the author & reviewed-by
> fields. I'll review those before commit, ofc, but I admit I lost track
> of who reviewed which part.

I have updated reviewers. I didn't add reviewers on the ones that
haven't really been reviewed yet.

> I'd focus on the first ~8-9 commits or so for now, we can commit more if
> things go reasonably well.

Sounds good. I will spend cleanup time on 0010-0013 tomorrow but would
love to know if you agree with the direction before I spend more time.

- Melanie

[1] https://www.postgresql.org/message-id/20240323002211.on5vb5ulk6lsdb2u%40liskov

Attachment Content-Type Size
v15-0001-BitmapHeapScan-begin-scan-after-bitmap-creation.patch text/x-diff 2.4 KB
v15-0002-BitmapHeapScan-set-can_skip_fetch-later.patch text/x-diff 2.3 KB
v15-0003-Push-BitmapHeapScan-skip-fetch-optimization-into.patch text/x-diff 14.7 KB
v15-0004-BitmapPrefetch-use-prefetch-block-recheck-for-sk.patch text/x-diff 2.4 KB
v15-0005-Update-BitmapAdjustPrefetchIterator-parameter-ty.patch text/x-diff 2.5 KB
v15-0006-Reduce-scope-of-BitmapHeapScan-tbmiterator-local.patch text/x-diff 3.1 KB
v15-0007-table_scan_bitmap_next_block-counts-lossy-and-ex.patch text/x-diff 5.1 KB
v15-0008-Remove-table_scan_bitmap_next_tuple-parameter-tb.patch text/x-diff 4.1 KB
v15-0009-Make-table_scan_bitmap_next_block-async-friendly.patch text/x-diff 22.4 KB
v15-0010-Unify-parallel-and-serial-BitmapHeapScan-iterato.patch text/x-diff 16.7 KB
v15-0011-Push-BitmapHeapScan-prefetch-code-into-heapam.c.patch text/x-diff 47.7 KB
v15-0012-Move-BitmapHeapScan-initialization-to-helper.patch text/x-diff 6.9 KB
v15-0013-Remove-table_scan_bitmap_next_block.patch text/x-diff 12.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2024-04-05 08:09:53 Re: remaining sql/json patches
Previous Message Alexander Lakhin 2024-04-05 08:00:00 Re: remaining sql/json patches