Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Shubham Barai <shubhambaraiss(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Borodin <amborodin86(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index
Date: 2018-03-10 07:53:56
Message-ID: CAA4eK1KNbJOFQ-+v6f+0NeqEEqE6+1pZr9Aemt4MFAUW_-DAYQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Mar 10, 2018 at 1:10 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Tue, Mar 6, 2018 at 11:26 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Mon, Mar 5, 2018 at 8:58 AM, Thomas Munro
>> <thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>>> On Sun, Mar 4, 2018 at 12:53 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>>> Yes, but I think it would be better if we call this once we are sure
>>>> that at least one tuple from the old bucket has been transferred
>>>> (consider the case where all tuples in the old bucket are dead). Apart
>>>> from this, I think this patch has missed handling the cases where we
>>>> scan the buckets while the split is in progress. In such cases, we
>>>> scan both the old and the new bucket, so I think we need to ensure
>>>> that we take a PredicateLock on both buckets during such scans.
>>>
>>> Hmm. Yeah.
>>>
>>> So, in _hash_first(), do you think we might just need this?
>>>
>>> if (H_BUCKET_BEING_POPULATED(opaque))
>>> {
>>>     ...
>>>     old_blkno = _hash_get_oldblock_from_newbucket(rel, bucket);
>>>     ...
>>>     old_buf = _hash_getbuf(rel, old_blkno, HASH_READ, LH_BUCKET_PAGE);
>>> +   PredicateLockPage(rel, BufferGetBlockNumber(old_buf), scan->xs_snapshot);
>>>     TestForOldSnapshot(scan->xs_snapshot, rel, BufferGetPage(old_buf));
>>>
>>> That is, if you begin scanning a 'new' bucket, we remember the old
>>> bucket and go and scan that too, so we'd better predicate-lock both up
>>> front (or I suppose we could do it later when we visit that page, but
>>> here it can be done in a single place).
>>>
>>
>> Yeah, that can work, but I am slightly worried that we might actually
>> never scan the old bucket (say, for queries with a LIMIT clause), in
>> which case it might give false positives for insertions into the old
>> bucket.
>>
>
> I have changed the patch to address this point by acquiring the
> predicate lock in _hash_readnext(), so that the lock is taken only when
> the scan actually moves to the old bucket.  I have also addressed
> another problem related to the transfer of predicate locks during a
> split: the locks are now transferred only when at least one tuple is
> actually moved from the old bucket to the new bucket.
>
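To make the first change concrete, the new-to-old handoff in
_hash_readnext() now takes the predicate lock roughly along these lines
(just a sketch; the field and variable names are from my reading of the
existing _hash_readnext() code, and the attached patch has the real
thing):

    else if (so->hashso_buc_populated && !so->hashso_buc_split)
    {
        /*
         * End of the new bucket: continue the scan in the bucket being
         * split.  Take the predicate lock on the old bucket's primary
         * page only here, i.e. only if the scan actually reaches it.
         */
        *bufp = so->hashso_split_bucket_buf;
        Assert(BufferIsValid(*bufp));

        LockBuffer(*bufp, BUFFER_LOCK_SHARE);
        PredicateLockPage(rel, BufferGetBlockNumber(*bufp),
                          scan->xs_snapshot);

        so->hashso_buc_split = true;
        block_found = true;
    }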

Added some additional text to the hash index README in the attached
patch to explain this change in the mechanism.
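
For the second change, the transfer of predicate locks during a split is
now deferred until we know at least one tuple really moves, roughly like
this inside the tuple-moving loop of _hash_splitbucket() (again only a
sketch; the local flag name is made up, and the buffer variable names
are from my reading of the current code):

    /* about to move a live tuple from the old bucket to the new one */
    if (!locks_transferred)
    {
        /*
         * Copy the predicate locks of the old bucket's primary page to
         * the new bucket's primary page.  Doing it here, for the first
         * transferred tuple only, means that no locks are propagated if
         * every tuple in the old bucket turns out to be dead.
         */
        PredicateLockPageSplit(rel,
                               BufferGetBlockNumber(bucket_obuf),
                               BufferGetBlockNumber(bucket_nbuf));
        locks_transferred = true;
    }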

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
Predicate-Locking-in-hash-index_v7.patch application/octet-stream 28.4 KB
