Re: Hash Indexes

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hash Indexes
Date: 2016-07-14 11:03:47
Message-ID: CAA4eK1JOc-rUXG1vR3d3bfSYnM3QDcKiScPvs9rTqRq-rHnQtw@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 22, 2016 at 8:48 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Wed, Jun 22, 2016 at 5:14 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> We can do it in the way as you are suggesting, but there is another thing
>> which we need to consider here. As of now, the patch tries to finish the
>> split if it finds split-in-progress flag in either old or new bucket. We
>> need to lock both old and new buckets to finish the split, so it is quite
>> possible that two different backends try to lock them in opposite order
>> leading to a deadlock. I think the correct way to handle is to always try
>> to lock the old bucket first and then new bucket. To achieve that, if the
>> insertion on new bucket finds that split-in-progress flag is set on a
>> bucket, it needs to release the lock and then acquire the lock first on old
>> bucket, ensure pincount is 1 and then lock new bucket again and ensure that
>> pincount is 1. I have already maintained the order of locks in scan (old
>> bucket first and then new bucket; refer changes in _hash_first()).
>> Alternatively, we can try to finish the splits only when someone tries to
>> insert in old bucket.
>
> Yes, I think locking buckets in increasing order is a good solution.
> I also think it's fine to only try to finish the split when the insert
> targets the old bucket. Finishing the split enables us to remove
> tuples from the old bucket, which lets us reuse space instead of
> accelerating more. So there is at least some potential benefit to the
> backend inserting into the old bucket. On the other hand, a process
> inserting into the new bucket derives no direct benefit from finishing
> the split.
>
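
To illustrate the agreed-upon rule, here is a minimal sketch (Python, not PostgreSQL code; the function name finish_split and the two lock objects are hypothetical stand-ins): every backend that wants to finish a split acquires the old bucket's lock first and the new bucket's lock second, regardless of which bucket it was originally inserting into, so no two backends can hold the locks in opposite orders and deadlock.

```python
import threading

old_bucket_lock = threading.Lock()
new_bucket_lock = threading.Lock()
completed = []

def finish_split(backend_id):
    # Fixed lock order: old bucket first, then new bucket.  A backend
    # inserting into the new bucket must release its lock and re-acquire
    # in this order before helping to finish the split.
    with old_bucket_lock:
        with new_bucket_lock:
            completed.append(backend_id)

threads = [threading.Thread(target=finish_split, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert sorted(completed) == [0, 1]
```

Because both backends follow the same total order on the two locks, the classic AB/BA deadlock cannot arise.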

Okay, following this suggestion, I have updated the patch so that only
insertion into the old bucket tries to finish the split. Apart from
that, I have fixed the issue reported by Mithun upthread and updated
the README to explain the locking used in the patch. I have also
changed the locking around vacuum so that it can work with concurrent
scans whenever possible. In the previous patch version, vacuum took a
cleanup lock on a bucket to remove the dead tuples and
moved-due-to-split tuples and to perform the squeeze operation, and it
held the lock on the bucket till the end of cleanup. Now it still
takes a cleanup lock on the bucket to out-wait scans, but it releases
the lock as it proceeds to clean the overflow pages. The idea is that
we first lock the next bucket page and only then release the lock on
the current bucket page. This ensures that any concurrent scan started
after we begin cleaning the bucket will always stay behind the
cleanup; allowing scans to overtake vacuum could let it remove tuples
that are still required for the sanctity of the scan. Also, for the
squeeze phase we proceed only if the pin count of the bucket's buffer
is one (we already hold an exclusive lock on that buffer by then);
otherwise we will try to squeeze the next time cleanup is required for
that bucket.
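
The hand-over-hand traversal above can be sketched as follows (again an illustrative Python sketch, not PostgreSQL code; page_locks and cleanup_chain are hypothetical): vacuum holds exactly one page lock at a time while walking the overflow chain, but always acquires the next page's lock before releasing the current one, so a scan that enters the chain behind the cleanup can never overtake it.

```python
import threading

# A bucket page followed by three overflow pages, each with its own lock.
page_locks = [threading.Lock() for _ in range(4)]

def cleanup_chain(cleaned):
    page_locks[0].acquire()              # cleanup lock on the bucket page
    for i in range(len(page_locks)):
        cleaned.append(i)                # remove dead / moved-by-split tuples here
        if i + 1 < len(page_locks):
            page_locks[i + 1].acquire()  # lock the next page first...
        page_locks[i].release()          # ...then release the current one

cleaned = []
cleanup_chain(cleaned)
assert cleaned == [0, 1, 2, 3]
```

Because there is never a moment when cleanup holds no lock on the chain, a concurrent scan following the same page order is forced to stay behind it.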

Thoughts/Suggestions?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
concurrent_hash_index_v3.patch application/octet-stream 98.2 KB
