Re: BitmapHeapScan streaming read user and prelim refactoring

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2024-03-02 23:59:19
Message-ID: ed18f73d-216d-42f4-b08a-04a50a802d62@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/1/24 17:51, Melanie Plageman wrote:
> On Fri, Mar 1, 2024 at 9:05 AM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> On 3/1/24 02:18, Melanie Plageman wrote:
>>> On Thu, Feb 29, 2024 at 6:44 PM Tomas Vondra
>>> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>>>
>>>> On 2/29/24 23:44, Tomas Vondra wrote:
>>>> 1) On master there's clear difference between eic=0 and eic=1 cases, but
>>>> on the patched build there's literally no difference - for example the
>>>> "uniform" distribution is clearly not great for prefetching, but eic=0
>>>> regresses to eic=1 poor behavior).
>>>
>>> Yes, so eic=0 and eic=1 are identical with the streaming read API.
>>> That is, eic 0 does not disable prefetching. Thomas is going to update
>>> the streaming read API to avoid issuing an fadvise for the last block
>>> in a range before issuing a read -- which would mean no prefetching
>>> with eic 0 and eic 1. Not doing prefetching with eic 1 actually seems
>>> like the right behavior -- which would be different than what master
>>> is doing, right?
>>
>> I don't think we should stop doing prefetching for eic=1, or at least
>> not based just on these charts. I suspect these "uniform" charts are not
>> a great example for the prefetching, because it's about distribution of
>> individual rows, and even a small fraction of rows may match most of the
>> pages. It's great for finding strange behaviors / corner cases, but
>> probably not a sufficient reason to change the default.
>
> Yes, I would like to see results from a data set where selectivity is
> more correlated to pages/heap fetches. But, I'm not sure I see how
> that is related to prefetching when eic = 1.
>
>> I think it makes sense to issue a prefetch one page ahead, before
>> reading/processing the preceding one, and it's fairly conservative
>> setting, and I assume the default was chosen for a reason / after
>> discussion.
>
> Yes, I suppose the overhead of an fadvise does not compare to the IO
> latency of synchronously reading that block. Actually, I bet the
> regression I saw by accidentally moving BitmapAdjustPrefetchIterator()
> after table_scan_bitmap_next_block() would be similar to the
> regression introduced by making eic = 1 not prefetch.
>
> When you think about IO concurrency = 1, it doesn't imply prefetching
> to me. But, I think we want to do the right thing and have parity with
> master.
>
>> My suggestion would be to keep the master behavior unless not practical,
>> and then maybe discuss changing the details later. The patch is already
>> complicated enough, better to leave that discussion for later.
>
> Agreed. Speaking of which, we need to add back use of tablespace IO
> concurrency for the streaming read API (which is used by
> BitmapHeapScan in master).
>
>>> With very low selectivity, you are less likely to get readahead
>>> (right?) and similarly less likely to be able to build up > 8kB IOs --
>>> which is one of the main value propositions of the streaming read
>>> code. I imagine that this larger read benefit is part of why the
>>> performance is better at higher selectivities with the patch. This
>>> might be a silly experiment, but we could try decreasing
>>> MAX_BUFFERS_PER_TRANSFER on the patched version and see if the
>>> performance gains go away.
>>
>> Sure, I can do that. Do you have any particular suggestion what value to
>> use for MAX_BUFFERS_PER_TRANSFER?
>
> I think setting it to 1 would be the same as always master -- doing
> only 8kB reads. The only thing about that is that I imagine the other
> streaming read code has some overhead which might end up being a
> regression on balance even with the prefetching if we aren't actually
> using the ranges/vectored capabilities of the streaming read
> interface. Maybe if you just run it for one of the very obvious
> performance improvement cases? I can also try this locally.
>

Here's some results from a build with

#define MAX_BUFFERS_PER_TRANSFER 1

There are three columns:

- master
- patched (original patches, with MAX_BUFFERS_PER_TRANSFER=128kB)
- patched-single (MAX_BUFFERS_PER_TRANSFER=8kB)

The color scales are always branch compared to master.

I think the expectation was that setting the transfer to 1 would make it
closer to master, reducing some of the regressions. But in practice the
effect is the opposite.

- In "cached" runs, this eliminates the small improvements (light
green), but leaves the regressions behind.

- In "uncached" runs, this exacerbates the regressions, particularly for
low selectivities (small values of matches).

I don't have a good intuition on why this would be happening :-(

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
cached-transfer-reduced.pdf application/pdf 192.7 KB
uncached-transfer-reduced.pdf application/pdf 211.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2024-03-02 23:59:40 Re: pub/sub - specifying optional parameters without values.
Previous Message Peter Smith 2024-03-02 23:46:59 Re: Synchronizing slots from primary to standby