Re: BitmapHeapScan streaming read user and prelim refactoring

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2024-04-06 13:40:11
Message-ID: 20240406134011.hjo3e2336sgcghfp@liskov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Apr 06, 2024 at 02:51:45AM +0200, Tomas Vondra wrote:
>
>
> On 4/6/24 01:53, Melanie Plageman wrote:
> > On Fri, Apr 05, 2024 at 04:06:34AM -0400, Melanie Plageman wrote:
> >> On Thu, Apr 04, 2024 at 04:35:45PM +0200, Tomas Vondra wrote:
> >>> On 4/4/24 00:57, Melanie Plageman wrote:
> >>>> On Sun, Mar 31, 2024 at 11:45:51AM -0400, Melanie Plageman wrote:
> >>> I'd focus on the first ~8-9 commits or so for now, we can commit more if
> >>> things go reasonably well.
> >>
> >> Sounds good. I will spend cleanup time on 0010-0013 tomorrow but would
> >> love to know if you agree with the direction before I spend more time.
> >
> > In attached v16, I've split out 0010-0013 into 0011-0017. I think it is
> > much easier to understand.
> >
>
> Anyway, I've attached it as .tgz in order to not confuse cfbot. All the
> review comments are marked with XXX, so grep for that in the patches.
> There's two separate patches - the first one suggests a code change, so
> it was better to not merge that with your code. The second has just a
> couple XXX comments, I'm not sure why I kept it separate.
>
> A couple review comments:
>
> * I think 0001-0009 are 99% ready to. I reworded some of the commit
> messages a bit - I realize it's a bit bold, considering you're native
> speaker and I'm not. If you could check I didn't make it worse, that
> would be great.

Attached v17 has *only* patches 0001-0009 with these changes. I will
work on applying the remaining patches, addressing feedback, and adding
comments next.

I have reviewed and incorporated all of your feedback on these patches.
Attached v17 is your exact patches with 1 or 2 *very* slight tweaks to
commit messages (comma splice removal and single word adjustments) as
well as the changes listed below:

I have changed the following:

- 0003 added an assert that rs_empty_tuples_pending is 0 on rescan and
endscan

- 0004 (your 0005)-- I followed up with Tom, but for now I have just
removed the XXX and also reworded the message a bit

- 0006 (your 0007) fixed up the variable name (you changed valid ->
valid_block but it had gotten changed back)

I have open questions on the following:

- 0003: should it be SO_NEED_TUPLES and need_tuples (instead of
SO_NEED_TUPLE and need_tuple)?

- 0009 (your 0010)
- Should I mention in the commit message that we added blockno and
pfblockno in the BitmapHeapScanState only for validation or is that
too specific?

- Should I mention that a future (imminent) commit will remove the
iterators from TableScanDescData and put them in HeapScanDescData? I
imagine folks don't want those there, but it is easier for the
progression of commits to put them there first and then move them

- I'm worried this comment is vague and or potentially not totally
correct. Should we remove it? I don't think we have conclusive proof
that this is true.
/*
* Adjusting the prefetch iterator before invoking
* table_scan_bitmap_next_block() keeps prefetch distance higher across
* the parallel workers.
*/

> * I'm not sure extra_flags is the right way to pass the flag in 0003.
> The "extra_" name is a bit weird, and no other table AM functions do it
> this way and pass explicit bool flags instead. So my first "review"
> commit does it like that. Do you agree it's better that way?

Yes.

> * The one question I'm somewhat unsure about is why Tom chose to use the
> "wrong" recheck flag in the 2017 commit, when the correct recheck flag
> is readily available. Surely that had a reason, right? But I can't think
> of one ...

See above.

> > While I was doing that, I realized that I should remove the call to
> > table_rescan() from ExecReScanBitmapHeapScan() and just rely on the new
> > table_rescan_bm() invoked from BitmapHeapNext(). That is done in the
> > attached.
> >
> > 0010-0018 still need comments updated but I focused on getting the split
> > out, reviewable version of them ready. I'll add comments (especially to
> > 0011 table AM functions) tomorrow. I also have to double-check if I
> > should add any asserts for table AMs about having implemented all of the
> > new begin/re/endscan() functions.
> >
>
> I added a couple more comments for those patches (10-12). Chances are
> the split in v16 clarifies some of my questions, but it'll have to wait
> till the morning ...

Will address this in next mail.

- Melanie

Attachment Content-Type Size
v17-0001-BitmapHeapScan-begin-scan-after-bitmap-creation.patch text/x-diff 2.5 KB
v17-0002-BitmapHeapScan-postpone-setting-can_skip_fetch.patch text/x-diff 2.4 KB
v17-0003-Push-BitmapHeapScan-skip-fetch-optimization-into.patch text/x-diff 14.7 KB
v17-0004-Use-prefetch-block-recheck-value-to-determine-sk.patch text/x-diff 2.5 KB
v17-0005-Change-BitmapAdjustPrefetchIterator-to-accept-Bl.patch text/x-diff 2.5 KB
v17-0006-BitmapHeapScan-Reduce-scope-of-tbmiterator-local.patch text/x-diff 3.1 KB
v17-0007-table_scan_bitmap_next_block-counts-lossy-and-ex.patch text/x-diff 5.0 KB
v17-0008-Remove-table_scan_bitmap_next_tuple-parameter-tb.patch text/x-diff 4.1 KB
v17-0009-Make-table_scan_bitmap_next_block-async-friendly.patch text/x-diff 22.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2024-04-06 14:04:45 Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?
Previous Message David Rowley 2024-04-06 12:51:03 Re: Flushing large data immediately in pqcomm