Re: Hash Indexes

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash Indexes
Date: 2016-09-15 05:41:41
Message-ID: CAA4eK1Kc+7GTSRWKFQCtz6Ya7_e9qHH-syOJX5hhfBR+5mdVNg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 15, 2016 at 4:04 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Tue, Sep 13, 2016 at 9:31 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
>>
>> =======
>>
>> +Vacuum acquires cleanup lock on bucket to remove the dead tuples and or tuples
>> +that are moved due to split. The need for cleanup lock to remove dead tuples
>> +is to ensure that scans' returns correct results. Scan that returns multiple
>> +tuples from the same bucket page always restart the scan from the previous
>> +offset number from which it has returned last tuple.
>>
>> Perhaps it would be better to teach scans to restart anywhere on the page,
>> than to force more cleanup locks to be taken?
>
>
> Commenting on one of my own questions:
>
> This won't work when the vacuum removes the tuple which an existing scan is
> currently examining and thus will be used to re-find its position when it
> realizes it is not visible and so takes up the scan again.
>
> The index tuples in a page are stored sorted just by hash value, not by the
> combination of (hash value, tid). If they were sorted by both, we could
> re-find our position even if the tuple had been removed, because we would
> know to start at the slot adjacent to where the missing tuple would be were
> it not removed. But unless we are willing to break pg_upgrade, there is no
> feasible way to change that now.
>

I think it is possible without breaking pg_upgrade if we match all
items of a page at once (and save them as a local copy), rather than
matching item by item as we do now. We already do something similar
for btree; see the explanation of BTScanPosItem and BTScanPosData in
nbtree.h.
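
To illustrate the idea, here is a minimal sketch of page-at-a-time matching. It is not the hash AM or nbtree code: every type and function name below (ToyPage, ToyScanPos, toy_read_page and so on) is hypothetical, and page locking is only mentioned in comments. The point it tries to show is that once the matches are copied into the scan's private array, the scan no longer needs to re-find its position on the page, so a later removal by vacuum cannot confuse it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ITEMS_PER_PAGE 64   /* hypothetical capacity of the toy page */

/* Toy stand-in for a heap tuple identifier (not ItemPointerData). */
typedef struct { uint32_t block; uint16_t offset; } ToyTid;

/* Toy index tuple: a hash value plus the heap TID it points to. */
typedef struct { uint32_t hash; ToyTid tid; } ToyIndexItem;

/* Toy bucket page; items are kept sorted by hash value only. */
typedef struct {
    size_t       nitems;
    ToyIndexItem items[MAX_ITEMS_PER_PAGE];
} ToyPage;

/* Scan position holding a private copy of all matches from one page. */
typedef struct {
    size_t nmatches;                     /* number of saved matches */
    size_t next;                         /* next saved match to return */
    ToyTid matches[MAX_ITEMS_PER_PAGE];
} ToyScanPos;

/*
 * Copy every item on the page whose hash equals the scan key into the
 * scan's local array. In a real index this would happen while the page
 * is locked; the lock could be released as soon as this returns.
 */
void toy_read_page(const ToyPage *page, uint32_t hash, ToyScanPos *pos)
{
    assert(page->nitems <= MAX_ITEMS_PER_PAGE);
    pos->nmatches = 0;
    pos->next = 0;
    for (size_t i = 0; i < page->nitems; i++) {
        if (page->items[i].hash == hash)
            pos->matches[pos->nmatches++] = page->items[i].tid;
    }
}

/* Return the next saved match (1) or report exhaustion (0).
 * This never looks at the page again. */
int toy_next(ToyScanPos *pos, ToyTid *out)
{
    if (pos->next >= pos->nmatches)
        return 0;
    *out = pos->matches[pos->next++];
    return 1;
}

/* Toy vacuum: remove the item at index idx and compact the page. */
void toy_vacuum_remove(ToyPage *page, size_t idx)
{
    assert(idx < page->nitems);
    for (size_t i = idx + 1; i < page->nitems; i++)
        page->items[i - 1] = page->items[i];
    page->nitems--;
}
```

In this sketch a scan calls toy_read_page() once per page and then pulls results with toy_next(). If toy_vacuum_remove() deletes the item the scan just returned, the remaining saved matches are still returned correctly, at the cost of some extra memory per scan.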

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
