Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Daniel Newman <dtnewman(at)gmail(dot)com>, danielnewman(at)umich(dot)edu, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #14210: filter by "=" constraint doesn't work when hash index is present on a column
Date: 2016-07-16 02:21:06
Message-ID: CAM3SWZTBAo4hjbBd780+MrOKiKp_TMo1N3A0Rw9_im8gbD7fQA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Mon, Jun 27, 2016 at 1:22 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Peter Geoghegan <pg(at)heroku(dot)com> writes:
>> I like this idea. Should I write a patch?
>
> BTW, while you're at it: it strikes me that the threshold should be
> either min(NBuffers, maintenance_work_mem) or
> min(NLocBuffer, maintenance_work_mem), depending on whether we're
> talking about a regular or temp table/index. That is, there's a
> pre-existing bug here that when NLocBuffer is a good deal less than
> NBuffers (which is the typical case nowadays) you'll get a lot of
> thrashing between local buffers and kernel cache, if the index isn't
> quite big enough to trigger the sorting code. This might not manifest
> as actual I/O, but it's not the intended behavior either.

Finally found time for this. Attached patch tests hash tuplesorts
along the lines we discussed. It adds one new tuplesort operation,
which does not spill to disk. It also asserts that hash values
retrieved through the tuplesort interface are in fact in sorted order.
I wanted to have something reliably trip when comparetup_index_hash()
gives wrong answers, even when a corrupt final index is perhaps not a
consequence of the underlying bug.

Due to how the hash build code works, maintenance_work_mem will need
to be 2 blocks smaller than the final index size in order to force a
sort (see code comments). I think that this does not matter, since
none of this is documented currently.

Having reviewed the coverage report for tuplesort.c [1], I think that
the only notable testing omission once this latest patch is committed
will be that there is no coverage for "backwards get tuple" cases
(this is more noticeable when looking at coverage statistics for
logtape.c, actually). This seems less important to me, and would be
awkward to test in any case, given my constraints.

[1] http://coverage.postgresql.org/src/backend/utils/sort/tuplesort.c.gcov.html
--
Peter Geoghegan

Attachment Content-Type Size
0001-Add-regression-test-cases-for-hash-tuplesorts.patch text/x-patch 8.9 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Amit Kapila 2016-07-16 12:55:08 Re: Invalid indexes should not consume update overhead
Previous Message Charles 2016-07-15 18:48:21 Using row-level security results in less optimal query plans