Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop

From: "Todd A(dot) Cook" <tcook(at)blackducksoftware(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop
Date: 2017-12-06 16:58:04
Message-ID: 1773c302-2fb1-749f-8ccc-62b23617c5ac@blackducksoftware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 12/06/17 08:33, Tomas Vondra wrote:
>
>>> Can you share the data, so that whoever fixes the bug can verify it also
>>> fixes your example?
>>
>> Sure.  It's attached.
>>
>
> Seems the dataset has pretty much the same issue as the one reported
> before, that is
>
> select hashint8(val), count(distinct val), count(*) from temp_f_03 group
> by 1 order by 2 desc;
>
> hashint8 | count | count
> -------------+-------+-------
> -1971396144 | 45 | 45
> 2035896403 | 42 | 42
> -1633843397 | 30 | 30
> 1425704662 | 29 | 29
> -455482779 | 22 | 22
> -300163565 | 17 | 17
> -1803963420 | 17 | 17
> -537082846 | 14 | 14
> 603707034 | 13 | 13
> -176369887 | 12 | 12
> 1274957136 | 11 | 11
> 1465522632 | 11 | 11
> -1589862230 | 10 | 10
> -1145403239 | 10 | 10
>
> i.e. there are many hash collisions (more than in the other data set).

If hashint8() is ultimately invoked by TupleHashTableHash() in execGroups.c,
it might be magnifying the difficulties here. The least significant bits,
which are used as the bucket number in simplehash.h, are not very well
distributed:

select val, to_hex(val), to_hex(hashint8(val)) from temp_f_03 limit 15 ;
val | to_hex | to_hex
----------------------+------------------+----------
4444319256653758784 | 3dad64d121468140 | 805ffffe
554179993563924608 | 7b0d7c49a018880 | 84dffffb
-3383965646518123872 | d109bd2c6b2982a0 | 9c3ffff7
-4706811054739454944 | beae0c48915f8420 | 191ffff6
618200668902031424 | 8944a3ba5d08040 | 2a3ffff0
5074043922812601024 | 466aa01079f982c0 | 7effffee
-8783188184262212928 | 861bd8e1b9a482c0 | a6bfffea
-4597800992953433792 | c031545b6b128140 | b1dfffea
8563040839807173408 | 76d608465dde8320 | 7d9fffe6
6092569112843158816 | 548d27540c888520 | 6f9fffe6
-7313351060369079936 | 9a81c1f558f98180 | 68ffffe5
-1786712428165627488 | e7345283536981a0 | 73ffffe5
-6153596242570280896 | aa9a08d20e6b8040 | ac3fffd8
88426174078092128 | 13a272306c58360 | b57fffd8
-5305589938458295680 | b65ec20faa4e8280 | ba9fffd3

-- todd

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2017-12-06 17:14:24 Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop
Previous Message Tom Lane 2017-12-06 16:40:13 Re: BUG #14949: array_append() - performance issues (in update)