Re: BUG #14722: Segfault in tuplesort_heap_siftup, 32 bit overflow

From: Sergey Koposov <skoposov(at)cmu(dot)edu>
To: "pg(at)bowt(dot)ie" <pg(at)bowt(dot)ie>
Cc: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14722: Segfault in tuplesort_heap_siftup, 32 bit overflow
Date: 2017-07-05 19:06:19
Lists: pgsql-bugs

On Thu, 2017-06-29 at 10:00 -0700, Peter Geoghegan wrote:
> On Thu, Jun 29, 2017 at 9:16 AM, <skoposov(at)cmu(dot)edu> wrote:
> > From a quick look at the code it looks to me that the reason for the bug is
> > the 32 bit int overflow in the j=2*i+1 calculation inside the
> > tuplesort_heap_siftup leading to negative values of j.
> It seems likely that the explanation is as simple as that. This
> happens during run generation with replacement selection. All versions
> are affected, but version 9.6+ is dramatically less likely to be
> affected, because replacement selection was all but killed in Postgres
> 9.6.
> This is an oversight in commit 263865a. The fix is to use a variable
> that won't overflow in tuplesort_heap_siftup() -- this is probably a
> one-liner, because when the variable overflows today, the correct
> behavior would be for control to break out of the loop that declares
> the overflowing variable "j", and, I don't see any similar problem in
> other heap maintenance routines. It's a very isolated problem.
> I could write a patch.

So that this does not get forgotten, I attach a trivial patch against the 9.5
branch, and I have also created a commitfest submission.
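
For readers without the attachment, the overflow itself can be illustrated in
isolation. The following is only a standalone sketch, not the attached patch
and not the actual tuplesort.c code: the helper names are hypothetical, and it
assumes a platform with 32-bit int where unsigned int has one more value bit
than int. It shows that the left-child index 2*i+1 stops fitting in a signed
int once i reaches 2^30, and that doing the same arithmetic in an unsigned
variable keeps the "no child left, break out of the loop" test correct.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: true left-child index of heap slot i,
 * computed in 64 bits so the value cannot wrap. */
static int64_t left_child_wide(int i)
{
    assert(i >= 0);
    return 2 * (int64_t) i + 1;
}

/* Does 2*i+1 fit in a signed int?  If not, "int j = 2*i+1" overflows. */
static int left_child_fits_int(int i)
{
    return left_child_wide(i) <= INT_MAX;
}

/* Same index in unsigned arithmetic.  Assuming UINT_MAX == 2*INT_MAX + 1,
 * the result is representable for every non-negative int i. */
static unsigned int left_child_unsigned(int i)
{
    assert(i >= 0);
    return 2 * (unsigned int) i + 1;
}

/* Sketch of the loop exit test in a sift-down over a heap of n entries:
 * stop when slot i has no left child inside the heap. */
static int has_no_child(int i, int n)
{
    assert(n >= 0);
    return left_child_unsigned(i) >= (unsigned int) n;
}
```

With these definitions, left_child_fits_int((1 << 30) - 1) is true while
left_child_fits_int(1 << 30) is false, so any heap with more than 2^30
entries can reach the overflowing case.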

The script below reproduces the bug (segfault) and can be used to test that
the patch fixes it (more than ~70 GB of RAM and 100+ GB of disk are needed):
create table xx3 as
select generate_series as a
from generate_series(0,(1.5*((1::bigint)<<31))::bigint);
set maintenance_work_mem to '70GB';
create index on xx3(a);

Hopefully somebody can take care of patching other PG branches.


