Re: weird hash plan cost, starting with pg10

From: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>
To: Richard Guo <guofenglinux(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: weird hash plan cost, starting with pg10
Date: 2020-03-25 14:29:59
Message-ID: 4751e4df-5f98-657e-fb88-c039dd1e7a31@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 25.03.2020 13:36, Richard Guo wrote:
>
> On Tue, Mar 24, 2020 at 3:36 PM Richard Guo <guofenglinux(at)gmail(dot)com
> <mailto:guofenglinux(at)gmail(dot)com>> wrote:
>
> On Tue, Mar 24, 2020 at 11:05 AM Thomas Munro
> <thomas(dot)munro(at)gmail(dot)com <mailto:thomas(dot)munro(at)gmail(dot)com>> wrote:
>
>
> I think there might be a case like this:
>
> * ExecRescanHashJoin() decides it can't reuse the hash table for a
> rescan, so it calls ExecHashTableDestroy(), clears HashJoinState's
> hj_HashTable and sets hj_JoinState to HJ_BUILD_HASHTABLE
> * the HashState node still has a reference to the pfree'd
> HashJoinTable!
> * HJ_BUILD_HASHTABLE case reaches the empty-outer optimisation
> case so
> it doesn't bother to build a new hash table
> * EXPLAIN examines the HashState's pointer to a freed
> HashJoinTable struct
>
>
> Yes, debugging with gdb shows this is exactly what happens.
>
>
> According to the scenario above, here is a recipe that reproduces this
> issue.
>
> -- recipe start
> create table a(i int, j int);
> create table b(i int, j int);
> create table c(i int, j int);
>
> insert into a select 3,3;
> insert into a select 2,2;
> insert into a select 1,1;
>
> insert into b select 3,3;
>
> insert into c select 0,0;
>
> analyze a;
> analyze b;
> analyze c;
>
> set enable_nestloop to off;
> set enable_mergejoin to off;
>
> explain analyze
> select exists(select * from b join c on a.i > c.i and a.i = b.i and
> b.j = c.j) from a;
> -- recipe end
>
> I tried this recipe on different PostgreSQL versions, starting from
> current master and going backwards. I was able to reproduce this issue
> on all versions above 8.4. In 8.4 version, we do not output information
> on hash buckets/batches. But manual inspection with gdb shows in 8.4 we
> also have the dangling pointer for HashState->hashtable. I didn't check
> versions below 8.4 though.
>
> Thanks
> Richard

I can propose the following patch for the problem.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
hash_join_instrumentation.patch text/x-patch 1.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pengzhou Tang 2020-03-25 14:35:32 Re: Parallel grouping sets
Previous Message Andy Fan 2020-03-25 14:24:42 Re: [PATCH] Keeps tracking the uniqueness with UniqueKey