Re: Bug in batch tuplesort memory CLUSTER case (9.6 only)

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Noah Misch <noah(at)leadboat(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bug in batch tuplesort memory CLUSTER case (9.6 only)
Date: 2016-07-01 19:30:40
Message-ID: CAM3SWZS_1F=ut+CtOt7inai_6qYdytzhxvZqR3uRDOj0qASEEQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 1, 2016 at 12:10 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I could give you steps to reproduce the bug, but they involve creating
>> a large table using my gensort tool [1]. It isn't trivial. Are you
>> interested?
>
> The bug can't very well be so simple that you need not include a set
> of steps to reproduce it and, at the same time, so complex that even
> so much as reading the list of steps to reproduce it might be more
> than I want to do.

Reading and following are two different things. I don't think reading
alone will do much good with the following steps, but here they are:

Check out my gensort tool from GitHub. Build the C tool with "make".
Then, from the working directory:

./postgres_load.py -m 250 --skew --logged
psql -c "CREATE INDEX segfaults on sort_test_skew(sortkey);"
psql -c "CLUSTER sort_test_skew USING segfaults;"

That test case isn't at all minimal, but that's how I happened upon
the bug. I didn't tell you this before now because I assumed that
you'd just accept that there was an omission made based on a quick
reading of the code. This is not a complicated bug; the pointer
HeapTuple.t_data needs to be updated when tuples are moved around in
memory, but isn't. readtup_cluster() initializes that field like this:

/* Reconstruct the HeapTupleData header */
tuple->t_data = (HeapTupleHeader) ((char *) tuple + HEAPTUPLESIZE);

More or less the same process needs to occur following any movement of
the tuple. In all other cases, as it happens, there is no need to do
anything similar.
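
A minimal sketch of the problem described above, under stated
assumptions: DemoTuple, DEMO_TUPLESIZE, demo_fix_t_data(),
demo_move_tuple() and demo_roundtrip() are hypothetical stand-ins made
up for illustration, not PostgreSQL's HeapTupleData, HEAPTUPLESIZE, or
any tuplesort.c code. It only shows why a header that stores a pointer
to its own trailing payload must have that pointer recomputed after the
bytes are moved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical, simplified stand-in for a tuple header that points at
 * its own payload. The payload is stored immediately after the header
 * in the same allocation.
 */
typedef struct DemoTuple
{
    size_t t_len;   /* payload length in bytes */
    char  *t_data;  /* must point just past this header */
} DemoTuple;

#define DEMO_TUPLESIZE sizeof(DemoTuple)

/* Recompute the interior pointer from the header's current address. */
void
demo_fix_t_data(DemoTuple *tuple)
{
    tuple->t_data = (char *) tuple + DEMO_TUPLESIZE;
}

/*
 * Move header plus payload to dest (assumed suitably aligned and large
 * enough). memmove() copies the old t_data value verbatim, so it still
 * points into the old location until it is fixed up.
 */
DemoTuple *
demo_move_tuple(char *dest, DemoTuple *src)
{
    size_t      total = DEMO_TUPLESIZE + src->t_len;
    DemoTuple  *moved;

    memmove(dest, src, total);
    moved = (DemoTuple *) dest;
    demo_fix_t_data(moved);     /* the step that is easy to forget */
    return moved;
}

/* Build a tuple mid-buffer, move it to the front, check the pointer. */
int
demo_roundtrip(void)
{
    const char  payload[] = "sortkey";
    char       *buf = malloc(256);
    DemoTuple  *tuple;
    DemoTuple  *moved;
    char       *stale;
    int         ok;

    if (buf == NULL)
        return 0;

    tuple = (DemoTuple *) (buf + 64);   /* offset keeps alignment */
    tuple->t_len = sizeof(payload);
    demo_fix_t_data(tuple);
    memcpy(tuple->t_data, payload, sizeof(payload));

    stale = tuple->t_data;              /* address before the move */
    moved = demo_move_tuple(buf, tuple);

    ok = moved->t_data == buf + DEMO_TUPLESIZE &&
         moved->t_data != stale &&
         strcmp(moved->t_data, payload) == 0;

    free(buf);
    return ok;
}
```

Calling demo_roundtrip() should return 1. Dropping the
demo_fix_t_data() call from demo_move_tuple() leaves t_data pointing at
the tuple's old location, which is the kind of stale pointer the bug
report is about.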

--
Peter Geoghegan
