Re: weird hash plan cost, starting with pg10

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: weird hash plan cost, starting with pg10
Date: 2020-03-25 10:36:17
Message-ID: CAMbWs4-oR+V1AtR7j8Nv85-n1eEng6Z4NmHvcJLEpsb69x5JMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 24, 2020 at 3:36 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:

> On Tue, Mar 24, 2020 at 11:05 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
> wrote:
>
>>
>> I think there might be a case like this:
>>
>> * ExecRescanHashJoin() decides it can't reuse the hash table for a
>> rescan, so it calls ExecHashTableDestroy(), clears HashJoinState's
>> hj_HashTable and sets hj_JoinState to HJ_BUILD_HASHTABLE
>> * the HashState node still has a reference to the pfree'd HashJoinTable!
>> * HJ_BUILD_HASHTABLE case reaches the empty-outer optimisation case so
>> it doesn't bother to build a new hash table
>> * EXPLAIN examines the HashState's pointer to a freed HashJoinTable struct
>>
>
> Yes, debugging with gdb shows this is exactly what happens.
>

According to the scenario above, here is a recipe that reproduces this
issue.

-- recipe start
create table a(i int, j int);
create table b(i int, j int);
create table c(i int, j int);

insert into a select 3,3;
insert into a select 2,2;
insert into a select 1,1;

insert into b select 3,3;

insert into c select 0,0;

analyze a;
analyze b;
analyze c;

set enable_nestloop to off;
set enable_mergejoin to off;

explain analyze
select exists(select * from b join c on a.i > c.i and a.i = b.i and b.j =
c.j) from a;
-- recipe end

I tried this recipe on different PostgreSQL versions, starting from
current master and going backwards. I was able to reproduce this issue
on all versions above 8.4. In 8.4 version, we do not output information
on hash buckets/batches. But manual inspection with gdb shows in 8.4 we
also have the dangling pointer for HashState->hashtable. I didn't check
versions below 8.4 though.

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sergei Kornilov 2020-03-25 10:42:56 Re: replay pause vs. standby promotion
Previous Message John Naylor 2020-03-25 10:32:05 Re: truncating timestamps on arbitrary intervals