Re: weird hash plan cost, starting with pg10

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: weird hash plan cost, starting with pg10
Date: 2020-04-13 09:07:41
Message-ID: CAMbWs49UkT4pyPHSX=26DV+ucCAL1K7pPJo6F-C7ejRYC8eSkA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Apr 11, 2020 at 4:11 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Konstantin Knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru> writes:
> > On 25.03.2020 13:36, Richard Guo wrote:
> >> I tried this recipe on different PostgreSQL versions, starting from
> >> current master and going backwards. I was able to reproduce this issue
> >> on all versions above 8.4. In 8.4 version, we do not output information
> >> on hash buckets/batches. But manual inspection with gdb shows in 8.4 we
> >> also have the dangling pointer for HashState->hashtable. I didn't check
> >> versions below 8.4 though.
>
> > I can propose the following patch for the problem.
>
> I looked at this patch a bit, and I don't think it goes far enough.
> What this issue is really pointing out is that EXPLAIN is not considering
> the possibility of a Hash node having had several hashtable instantiations
> over its lifespan. I propose what we do about that is generalize the
> policy that show_hash_info() is already implementing (in a rather half
> baked way) for multiple workers, and report the maximum field values
> across all instantiations. We can combine the code needed to do so
> with the code for the parallelism case, as shown in the 0001 patch
> below.
>

I looked through 0001 patch and it looks good to me.

At first I was wondering if we need to check whether HashState.hashtable
is not NULL in ExecShutdownHash() before we decide to allocate save
space for HashState.hinstrument. And then I convinced myself that that's
not necessary since HashState.hinstrument and HashState.hashtable cannot
be both NULL there.

>
> In principle we could probably get away with back-patching 0001,
> at least into branches that already have the HashState.hinstrument
> pointer. I'm not sure it's worth any risk though. A much simpler
> fix is to make sure we clear the dangling hashtable pointer, as in
> 0002 below (a simplified form of Konstantin's patch). The net
> effect of that is that in the case where a hash table is destroyed
> and never rebuilt, EXPLAIN ANALYZE would report no hash stats,
> rather than possibly-garbage stats like it does today. That's
> probably good enough, because it should be an uncommon corner case.
>

Yes it's an uncommon corner case. But I think it may still surprise
people that most of the time the hash stat shows well but sometimes it
does not.

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-04-13 09:24:53 Re: Vacuum o/p with (full 1, parallel 0) option throwing an error
Previous Message Kyotaro Horiguchi 2020-04-13 08:49:24 Re: pg_basebackup, manifests and backends older than ~12