Re: BitmapHeapScan streaming read user and prelim refactoring

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Andres Freund <andres(at)anarazel(dot)de>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2024-04-07 14:41:59
Message-ID: aa3e7e5d-a452-4df0-b928-d9765b2d7088@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 4/7/24 16:24, Melanie Plageman wrote:
> On Sun, Apr 7, 2024 at 7:38 AM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>>
>>
>> On 4/7/24 06:17, Melanie Plageman wrote:
>>> On Sun, Apr 07, 2024 at 02:27:43AM +0200, Tomas Vondra wrote:
>>>> On 4/6/24 23:34, Melanie Plageman wrote:
>>>>> ...
>>>>>>
>>>>>> I realized it makes more sense to add a FIXME (I used XXX. I'm not when
>>>>>> to use what) with a link to the message where Andres describes why he
>>>>>> thinks it is a bug. If we plan on fixing it, it is good to have a record
>>>>>> of that. And it makes it easier to put a clear and accurate comment.
>>>>>> Done in 0009.
>>>>>>
>>>>>>> OK, thanks. If think 0001-0008 are ready to go, with some minor tweaks
>>>>>>> per above (tuple vs. tuples etc.), and the question about the recheck
>>>>>>> flag. If you can do these tweaks, I'll get that committed today and we
>>>>>>> can try to get a couple more patches in tomorrow.
>>>>>
>>>>> Attached v19 rebases the rest of the commits from v17 over the first
>>>>> nine patches from v18. All patches 0001-0009 are unchanged from v18. I
>>>>> have made updates and done cleanup on 0010-0021.
>>>>>
>>>>
>>>> I've pushed 0001-0005, I'll get back to this tomorrow and see how much
>>>> more we can get in for v17.
>>>
>>> Thanks! I thought about it a bit more, and I got worried about the
>>>
>>> Assert(scan->rs_empty_tuples_pending == 0);
>>>
>>> in heap_rescan() and heap_endscan().
>>>
>>> I was worried if we don't complete the scan it could end up tripping
>>> incorrectly.
>>>
>>> I tried to come up with a query which didn't end up emitting all of the
>>> tuples on the page (using a LIMIT clause), but I struggled to come up
>>> with an example that qualified for the skip fetch optimization and also
>>> returned before completing the scan.
>>>
>>> I could work a bit harder tomorrow to try and come up with something.
>>> However, I think it might be safer to just change these to:
>>>
>>> scan->rs_empty_tuples_pending = 0
>>>
>>
>> Hmmm, good point. I haven't tried, but wouldn't something like "SELECT 1
>> FROM t WHERE column = X LIMIT 1" do the trick? Probably in a join, as a
>> correlated subquery?
>
> Unfortunately (or fortunately, I guess) that exact thing won't work
> because even constant values in the target list disqualify it for the
> skip fetch optimization.
>
> Being a bit too lazy to look at planner code this morning, I removed
> the target list requirement like this:
>
> - need_tuples = (node->ss.ps.plan->qual != NIL ||
> - node->ss.ps.plan->targetlist != NIL);
> + need_tuples = (node->ss.ps.plan->qual != NIL);
>
> And can easily trip the assert with this:
>
> create table foo (a int);
> insert into foo select i from generate_series(1,10)i;
> create index on foo(a);
> vacuum foo;
> select 1 from (select 2 from foo limit 3);
>
> Anyway, I don't know if we could find a query that does actually hit
> this. The only bitmap heap scan queries in the regress suite that meet
> the
> BitmapHeapScanState->ss.ps.plan->targetlist == NIL
> condition are aggregates (all are count(*)).
>
> I'll dig a bit more later, but do you think this is worth adding an
> open item for? Even though I don't have a repro yet?
>

Try this:

create table t (a int, b int) with (fillfactor=10);
insert into t select mod((i/22),2), (i/22) from generate_series(0,1000)
S(i);
create index on t(a);
vacuum analyze t;

set enable_indexonlyscan = off;
set enable_seqscan = off;
explain (analyze, verbose) select 1 from (values (1)) s(x) where exists
(select * from t where a = x);

KABOOM!

#2 0x000078a16ac5fafe in __GI_raise (sig=sig(at)entry=6) at
../sysdeps/posix/raise.c:26
#3 0x000078a16ac4887f in __GI_abort () at abort.c:79
#4 0x0000000000bb2c5a in ExceptionalCondition (conditionName=0xc42ba8
"scan->rs_empty_tuples_pending == 0", fileName=0xc429c8 "heapam.c",
lineNumber=1090) at assert.c:66
#5 0x00000000004f68bb in heap_endscan (sscan=0x19af3a0) at heapam.c:1090
#6 0x000000000077a94c in table_endscan (scan=0x19af3a0) at
../../../src/include/access/tableam.h:1001

So yeah, this assert is not quite correct. It's not breaking anything at
the moment, so we can fix it now or add it as an open item.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2024-04-07 14:52:05 Re: Cluster::restart dumping logs when stop fails
Previous Message Melanie Plageman 2024-04-07 14:38:46 Re: BitmapHeapScan streaming read user and prelim refactoring