Re: BitmapHeapScan streaming read user and prelim refactoring

From: Melanie Plageman <melanieplageman(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2024-03-02 22:28:03
Message-ID: CAAKRu_ZOd1RbM6rj4_RxhzChz-Vp9roWg+B67vmKiKJCwn=hZw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Mar 2, 2024 at 10:05 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> Here's a PDF with charts for a dataset where the row selectivity is more
> correlated to selectivity of pages. I'm attaching the updated script,
> with the SQL generating the data set. But the short story is all rows on
> a single page have the same random value, so the selectivity of rows and
> pages should be the same.
>
> The first page has results for the original "uniform", the second page
> is the new "uniform-pages" data set. There are 4 charts, for
> master/patched and 0/4 parallel workers. Overall the behavior is the
> same, but for the "uniform-pages" it's much more gradual (with respect
> to row selectivity). I think that's expected.

Cool! Thanks for doing this. I have convinced myself that Thomas'
forthcoming patch, which will eliminate prefetching with eic = 0, will
fix the eic = 0 blue line regressions. The eic = 1 case with four
parallel workers is more confusing, and it seems more noticeably bad
with your randomized-pages dataset.
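
To make the difference between the two data sets concrete, here is a
rough back-of-the-envelope sketch in Python. It is not the attached
script. The function names and the figure of 100 rows per page are
assumptions chosen only for illustration, and the "uniform" case assumes
each row matches independently of the others on its page.

```python
def page_selectivity_uniform(row_selectivity: float, rows_per_page: int) -> float:
    """Fraction of pages touched when every row matches independently.

    A page has to be fetched if at least one of its rows matches.
    """
    return 1.0 - (1.0 - row_selectivity) ** rows_per_page


def page_selectivity_uniform_pages(row_selectivity: float) -> float:
    """Fraction of pages touched when all rows on a page share one value.

    A page matches or fails as a unit, so page and row selectivity agree.
    """
    return row_selectivity


if __name__ == "__main__":
    rows_per_page = 100  # assumed value, not taken from the benchmark
    for s in (0.001, 0.01, 0.1, 0.5):
        uniform = page_selectivity_uniform(s, rows_per_page)
        by_page = page_selectivity_uniform_pages(s)
        print(f"row sel {s:>5}: uniform pages {uniform:.3f}, uniform-pages {by_page:.3f}")
```

Under these assumptions the "uniform" data set touches most pages at a
very low row selectivity, while the per-page data set grows linearly,
which would be consistent with the more gradual curves.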

Regarding your earlier question:

> Just to be sure we're on the same page regarding what eic=1 means,
> consider a simple sequence of pages: A, B, C, D, E, ...
>
> With the current "master" code, eic=1 means we'll issue a prefetch for B
> and then read+process A. And then issue prefetch for C and read+process
> B, and so on. It's always one page ahead.

Yes, that is what I mean by eic = 1.
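
For concreteness, here is a toy Python model of that ordering. It is
only a sketch of the sequence described above, not PostgreSQL code:
`prefetch_schedule` and `distance` are made-up names, and it ignores
everything else the real scan does.

```python
def prefetch_schedule(pages, distance=1):
    """Return the ordered ("prefetch" | "read", page) events for a scan.

    Before each page is read, prefetches are issued for pages up to
    `distance` positions ahead of it. distance=0 means no prefetching.
    """
    events = []
    next_to_prefetch = 1  # the first page is read directly, never prefetched
    for i, page in enumerate(pages):
        # Never prefetch the page we are about to read (matters for distance=0).
        next_to_prefetch = max(next_to_prefetch, i + 1)
        limit = min(i + distance, len(pages) - 1)
        while next_to_prefetch <= limit:
            events.append(("prefetch", pages[next_to_prefetch]))
            next_to_prefetch += 1
        events.append(("read", page))
    return events


if __name__ == "__main__":
    for kind, page in prefetch_schedule(list("ABCDE"), distance=1):
        print(kind, page)
```

With `distance=1` this issues the prefetch for B, reads A, issues the
prefetch for C, reads B, and so on, always one page ahead.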

> As for how this is related to eic=1 - I think my point was that these
> are "adversary" data sets, most likely to show regressions. This applies
> especially to the "uniform" data set, because as the row selectivity
> grows, it's more and more likely it's right after the current one,
> and so a read-ahead would likely do the trick.

No, I think you're right that eic=1 should prefetch. As you say, with
high selectivity, a bitmap plan is likely not the best one anyway, so
not prefetching in order to preserve the performance of those cases
seems silly.

- Melanie
