Re: BitmapHeapScan streaming read user and prelim refactoring

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2024-03-01 14:05:49
Message-ID: 0c7c0673-d012-41d1-8e76-31cc4a1b1eec@enterprisedb.com
Lists: pgsql-hackers

On 3/1/24 02:18, Melanie Plageman wrote:
> On Thu, Feb 29, 2024 at 6:44 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> On 2/29/24 23:44, Tomas Vondra wrote:
>>>
>>> ...
>>>
>>>>>
>>>>> I do have some partial results, comparing the patches. I only ran one of
>>>>> the more affected workloads (cyclic) on the xeon, attached is a PDF
>>>>> comparing master and the 0001-0014 patches. The percentages are timing
>>>>> vs. the preceding patch (green - faster, red - slower).
>>>>
>>>> Just confirming: the results are for uncached?
>>>>
>>>
>>> Yes, cyclic data set, uncached case. I picked this because it seemed
>>> like one of the most affected cases. Do you want me to test some other
>>> cases too?
>>>
>>
>> BTW I decided to look at the data from a slightly different angle and
>> compare the behavior with increasing effective_io_concurrency. Attached
>> are charts for three "uncached" cases:
>>
>> * uniform, work_mem=4MB, workers_per_gather=0
>> * linear-fuzz, work_mem=4MB, workers_per_gather=0
>> * uniform, work_mem=4MB, workers_per_gather=4
>>
>> Each page has charts for master and patched build (with all patches). I
>> think there's a pretty obvious difference in how increasing e_i_c
>> affects the two builds:
>
> Wow! These visualizations make it exceptionally clear. I want to go to
> the Vondra school of data visualizations for performance results!
>

Welcome to my lecture on how to visualize data. The process has about
four simple steps:

1) collect data for a lot of potentially interesting cases
2) load them into excel / google sheets / ...
3) slice and dice them into charts that you understand / can explain
4) every now and then there's something you can't understand / explain

Thank you for attending my lecture ;-) No homework today.

>> 1) On master there's a clear difference between the eic=0 and eic=1
>> cases, but on the patched build there's literally no difference - for
>> example the "uniform" distribution is clearly not great for
>> prefetching, yet eic=0 regresses to the same poor behavior as eic=1.
>
> Yes, so eic=0 and eic=1 are identical with the streaming read API.
> That is, eic 0 does not disable prefetching. Thomas is going to update
> the streaming read API to avoid issuing an fadvise for the last block
> in a range before issuing a read -- which would mean no prefetching
> with eic 0 and eic 1. Not doing prefetching with eic 1 actually seems
> like the right behavior -- which would be different than what master
> is doing, right?
>

I don't think we should stop doing prefetching for eic=1, or at least
not based just on these charts. I suspect these "uniform" charts are not
a great example for prefetching, because the selectivity applies to
individual rows, and even a small fraction of matching rows may touch
most of the pages. It's great for finding strange behaviors / corner
cases, but probably not a sufficient reason to change the default.
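To illustrate why row-level selectivity is misleading here - assuming
matching rows are uniform and independent, the fraction of pages that
contain at least one match grows very quickly (a back-of-the-envelope
model, not taken from the benchmark scripts):

```python
def expected_page_fraction(row_selectivity, rows_per_page):
    """Probability that a page contains at least one matching row,
    assuming matches are uniform and independent across rows."""
    return 1.0 - (1.0 - row_selectivity) ** rows_per_page

# With only 1% of rows matching and 100 rows per page, roughly 63% of
# pages contain a match, so the scan reads most of the table anyway.
print(expected_page_fraction(0.01, 100))
```

So even at "low" row selectivity the uniform data set behaves almost
like a sequential scan, which is why it's a stress test for prefetching
rather than a representative workload.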

I think it makes sense to issue a prefetch one page ahead, before
reading/processing the preceding one - it's a fairly conservative
setting, and I assume the default was chosen for a reason / after some
discussion.
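As a simplified model of what "one page ahead" means (this is a rough
sketch, not the actual bitmap heap scan code - it ignores the ramp-up of
the prefetch target on master):

```python
def simulate_scan(blocks, eic):
    """Model of prefetching during a bitmap heap scan: before reading
    block i, make sure an fadvise has been issued for every block up to
    i + eic. Returns the list of blocks that got an fadvise."""
    fadvises = []
    issued = 0  # index of the next block to consider for prefetch
    for i, _ in enumerate(blocks):
        while issued < len(blocks) and issued <= i + eic:
            if issued > i:  # the current block is read synchronously
                fadvises.append(blocks[issued])
            issued += 1
        # ... read blocks[i] here ...
    return fadvises

# eic=0 issues no prefetches; eic=1 stays exactly one block ahead.
print(simulate_scan([10, 11, 12, 13], 0))
print(simulate_scan([10, 11, 12, 13], 1))
```

With eic=1 each block (except the first) still gets one fadvise before
it's read, which is the conservative behavior I'd prefer to keep.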

My suggestion would be to keep the master behavior unless that's
impractical, and maybe revisit the details separately. The patch is
already complicated enough; better to leave that discussion for later.

> Hopefully this fixes the clear difference between master and the
> patched version at eic 0.
>
>> 2) For some reason, the prefetching with eic>1 performs much better
>> with the patches, except with very low selectivity values (close to
>> 0%). Not sure why this is happening - either the overhead is much
>> lower (which would matter on these "adversarial" data distributions,
>> but how could that be when fadvise is not free), or it ends up not
>> doing any prefetching (but then what about (1)?).
>
> For the uniform with four parallel workers, eic == 0 being worse than
> master makes sense for the above reason. But I'm not totally sure why
> eic == 1 would be worse with the patch than with master. Both are
> doing a (somewhat useless) prefetch.
>

Right.

> With very low selectivity, you are less likely to get readahead
> (right?) and similarly less likely to be able to build up > 8kB IOs --
> which is one of the main value propositions of the streaming read
> code. I imagine that this larger read benefit is part of why the
> performance is better at higher selectivities with the patch. This
> might be a silly experiment, but we could try decreasing
> MAX_BUFFERS_PER_TRANSFER on the patched version and see if the
> performance gains go away.
>

Sure, I can do that. Do you have any particular suggestion what value to
use for MAX_BUFFERS_PER_TRANSFER?
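For anyone following along, the "building up > 8kB IOs" point boils down
to coalescing runs of consecutive block numbers into one larger read,
capped by MAX_BUFFERS_PER_TRANSFER. A rough stand-in for that behavior
(the function name and shape are made up, not the streaming read API):

```python
def coalesce(blocks, max_buffers):
    """Combine consecutive block numbers into (start, nblocks) reads,
    capped at max_buffers blocks per read."""
    reads = []
    for blk in blocks:
        last = reads[-1] if reads else None
        if last and blk == last[0] + last[1] and last[1] < max_buffers:
            reads[-1] = (last[0], last[1] + 1)  # extend the current read
        else:
            reads.append((blk, 1))  # start a new read
    return reads

# Dense runs merge into big reads; scattered blocks stay at one block
# each, which is why low selectivity gets little benefit.
print(coalesce([1, 2, 3, 7, 8], 16))
print(coalesce([1, 2, 3, 4], 2))
```

Lowering the cap (as suggested above) would force even dense runs back
into small reads, which should show whether the gains come from the
larger IOs.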

I'll also try to add a better version of uniform, where the selectivity
matches more closely to pages, not rows.

>> 3) I'm not sure about the linear-fuzz case, the only explanation I
>> have is that we're able to skip almost all of the prefetches (and
>> read-ahead likely works pretty well here).
>
> I started looking at the data generated by linear-fuzz to understand
> exactly what effect the fuzz was having but haven't had time to really
> understand the characteristics of this dataset. In the original
> results, I thought uncached linear-fuzz and linear had similar results
> (performance improvement from master). What do you expect with linear
> vs linear-fuzz?
>

I don't know, TBH. My intent was to have a data set with correlated
data, either perfectly (linear) or with some noise (linear-fuzz). But
it's not like I spent much time thinking about it - it's more a case of
throwing stuff at the wall and seeing what sticks.
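Roughly, the idea behind the two data sets is something like this (a
sketch only - the parameter names and the fuzz magnitude are made up,
the real data comes from the benchmark scripts in this thread):

```python
import random

def generate(nrows, fuzz=0.0, seed=42):
    """Column values that track the row position exactly (linear) or
    with Gaussian noise added (linear-fuzz)."""
    rng = random.Random(seed)
    return [i + rng.gauss(0.0, fuzz) for i in range(nrows)]

linear = generate(1000)             # perfectly correlated with heap order
fuzzy = generate(1000, fuzz=50.0)   # still correlated, but locally noisy
```

With the linear data a range scan touches one contiguous run of pages
(so read-ahead works great); the fuzz smears the matching rows over a
wider band of pages, which is where prefetching could start to matter.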

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
