Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, "Todd A(dot) Cook" <tcook(at)blackducksoftware(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop
Date: 2017-11-28 04:03:03
Message-ID: 17357.1511841783@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> writes:
> When SH_INSERT tries to insert that final extra value, insertdist
> keeps exceeding SH_GROW_MAX_DIB (25) no matter how many times we
> double the size (at least until my computer gives up, somewhere around
> 11 doublings and 75GB of virtual memory). If you set SH_GROW_MAX_DIB
> to 26 then it succeeds, but I guess some other attack could be crafted
> for that. What is the theory behind this parameter?

You beat me to it --- after looking at simplehash.h I'd guessed that
either the SH_GROW_MAX_DIB or SH_GROW_MAX_MOVE code path was causing
an infinite loop, but I'd not gotten to determining which one yet.

I'd ask what's the theory behind SH_GROW_MAX_MOVE, as well. Neither
of them are obviously loop-proof.

Note that the sample data has a lot of collisions:

regression=# select hashint8(val), count(*) from reproducer group by 1 order by 2 desc;
hashint8 | count
-------------+-------
441526644 | 2337
-1117776826 | 1221
-1202007016 | 935
-2068831050 | 620
1156644653 | 538
553783815 | 510
259780770 | 444
371047036 | 394
915722575 | 359
... etc etc ...

It's evidently more complicated than just that the code fails with
more than SH_GROW_MAX_DIB duplicate hashcodes, but I suspect not
by a lot. There needs to be a safety valve that prevents letting
the hash fill factor approach zero, which is what's happening in
this test case.

(I wonder whether these loops oughtn't contain CHECK_FOR_INTERRUPTS,
btw.)

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Thomas Munro 2017-11-28 04:51:50 Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop
Previous Message Thomas Munro 2017-11-28 02:20:40 Re: BUG #14932: SELECT DISTINCT val FROM table gets stuck in an infinite loop