Re: BitmapHeapScan streaming read user and prelim refactoring

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2024-03-28 21:43:42
Message-ID: 7e98781c-34e0-4c02-8e5b-0500aa074271@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 3/27/24 20:37, Melanie Plageman wrote:
> On Mon, Mar 25, 2024 at 12:07:09PM -0400, Melanie Plageman wrote:
>> On Sun, Mar 24, 2024 at 06:37:20PM -0400, Melanie Plageman wrote:
>>> On Sun, Mar 24, 2024 at 5:59 PM Tomas Vondra
>>> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>>>
>>>> BTW when you say "up to 'Make table_scan_bitmap_next_block() async
>>>> friendly'" do you mean including that patch, or that this is the first
>>>> patch that is not one of the independently useful patches.
>>>
>>> I think the code is easier to understand with "Make
>>> table_scan_bitmap_next_block() async friendly". Prior to that commit,
>>> table_scan_bitmap_next_block() could return false even when the bitmap
>>> has more blocks and expects the caller to handle this and invoke it
>>> again. I think that interface is very confusing. The downside of the
>>> code in that state is that the code for prefetching is still in the
>>> BitmapHeapNext() code and the code for getting the current block is in
>>> the heap AM-specific code. I took a stab at fixing this in v9's 0013,
>>> but the outcome wasn't very attractive.
>>>
>>> What I will do tomorrow is reorder and group the commits such that all
>>> of the commits that are useful independent of streaming read are first
>>> (I think 0014 and 0015 are independently valuable but they are on top
>>> of some things that are only useful to streaming read because they are
>>> more recently requested changes). I think I can actually do a bit of
>>> simplification in terms of how many commits there are and what is in
>>> each. Just to be clear, v9 is still reviewable. I am just going to go
>>> back and change what is included in each commit.
>>
>> So, attached v10 does not include the new version of streaming read API.
>> I focused instead on the refactoring patches commit regrouping I
>> mentioned here.
>
> Attached v11 has the updated Read Stream API Thomas sent this morning
> [1]. No other changes.
>

I think there's some sort of bug, triggering this assert in heapam

Assert(BufferGetBlockNumber(hscan->rs_cbuf) == tbmres->blockno);

I haven't looked for the root cause, and it's not exactly deterministic,
but try this:

create table t (a int, b text);

insert into t select 10000 * random(), md5(i::text)
from generate_series(1,10000000) s(i);^C

create index on t (a);

explain analyze select * from t where a = 200;
explain analyze select * from t where a < 200;

and then vary the condition a bit (different values, inequalities,
etc.). For me it hits the assert in a couple tries.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment Content-Type Size
bt.txt text/plain 22.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2024-03-28 21:48:28 Re: pg_upgrade --copy-file-range
Previous Message Nathan Bossart 2024-03-28 21:38:54 Re: Popcount optimization using AVX512