Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index

From: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Shubham Barai <shubhambaraiss(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Borodin <amborodin86(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index
Date: 2018-02-13 06:47:25
Message-ID: CAEepm=2h8JDu0Ym_XYJ57S+4MOL+c1qredP9-ByUHwV9M_4b3w@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 13, 2018 at 4:28 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Sun, Jan 28, 2018 at 7:28 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> On Thu, Jan 25, 2018 at 7:29 PM, Alexander Korotkov
>> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
>>>> As Shubham seems to be running out of time, I thought of helping him
>>>> by looking into the above-suggested idea. I think one way to lock a
>>>> particular hash value is to use the TID of the heap tuple associated
>>>> with each index entry (the index entry for the hash index will be the
>>>> hash value).
>>>
>>>
>>> Sorry, I didn't quite get what you mean. If we lock either the TID of
>>> the associated heap tuple or the TID of the hash index tuple, then what
>>> will we lock in the case where nothing is found? Even if we find
>>> nothing, we have to place some lock according to the search key in
>>> order to detect the case where somebody later inserts a row that we
>>> would have seen according to that search key.
>>>
>>
>> Okay, but if you use the hash value as the lock tag (which is possible),
>> how will we deal with things like page splits? Even if we include the
>> block number/page/bucket number corresponding to the hash value along
>> with the hash value in the lock tag, it still doesn't appear to work. I
>> think using page-level locks for the index makes sense, especially
>> because it will be convenient to deal with page splits.
>>
>
> What I intend to say here is that we already have a mechanism,
> PredicateLockPageSplit(), which can deal with predicate locks during a
> page split if we operate at the page level. However, if we want to go
> for the hash value locking technique, it could be quite complex and
> expensive to make it work, as we would have to search all the locks
> that belong to the bucket being split and then move them to the new
> bucket.

True.
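
For reference, if we stay at page granularity, the split path only
needs to hand the old and new bucket pages to the existing machinery.
A minimal sketch follows; the enclosing function and its arguments are
made up for illustration, while PredicateLockPageSplit() and
BufferGetBlockNumber() are existing functions:

/*
 * Sketch only: carry predicate locks over when a hash bucket is
 * split, if we lock at the bucket primary page level.  The enclosing
 * function and variable names are illustrative, not from the patch.
 */
#include "postgres.h"
#include "utils/rel.h"
#include "storage/bufmgr.h"
#include "storage/predicate.h"

static void
hash_split_copy_predicate_locks(Relation rel,
                                Buffer old_bucket_buf,
                                Buffer new_bucket_buf)
{
    /*
     * PredicateLockPageSplit() makes sure SIREAD locks held on the old
     * bucket's primary page also cover the new bucket's primary page,
     * so readers that locked the old bucket still conflict with writes
     * of tuples that end up in the new bucket.
     */
    PredicateLockPageSplit(rel,
                           BufferGetBlockNumber(old_bucket_buf),
                           BufferGetBlockNumber(new_bucket_buf));
}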

One way to avoid all that might be to use pseudo page numbers that
don't suffer from splits. I don't know how you'd choose the
constant, but it could be something like pseudo page number = hash
value % 1024. In other words, you'd use the full hash value for the
'tuple' part of the predicate lock tag, and a shorter hash value for
the 'page' part of the predicate lock tag, so you'd never have to
handle splits, and promotion from fine-grained 'tuple' (= hash value)
locks to coarse-grained 'page' (= short hash value) locks would still
work automatically when space runs out.
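
To make that a bit more concrete, here's a rough sketch of the mapping,
assuming 1024 pseudo pages; the macro and function names are invented
for illustration, while PredicateLockPage() is the existing call a
reader would make:

/*
 * Sketch only: derive the 'page' part of the predicate lock tag from
 * the hash value itself, so lock tags never have to be adjusted when
 * a bucket splits.  Names and the constant are illustrative.
 */
#include "postgres.h"
#include "storage/block.h"

#define HASH_PREDICATE_PSEUDO_PAGES 1024

static BlockNumber
hash_pseudo_page(uint32 hashvalue)
{
    /* coarse 'page' component: stable across bucket splits */
    return (BlockNumber) (hashvalue % HASH_PREDICATE_PSEUDO_PAGES);
}

/*
 * A scan that probed for 'hashvalue' (whether or not it found any
 * matches) could then take
 *
 *     PredicateLockPage(rel, hash_pseudo_page(hashvalue), snapshot);
 *
 * and a finer-grained version would additionally fold the full hash
 * value into the 'tuple' part of the same tag, so that lock promotion
 * under memory pressure collapses many hash values onto one pseudo
 * page, just as heap tuple locks collapse onto heap pages.
 */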

> Alexander/Thomas, do you have better ideas to make it work, otherwise,
> I think we can proceed to review the Shubham's current approach/patch?

I think we should proceed to review the current patch. As far as I
can see, adding more fine-grained locking would remain a possibility
for future improvement. With this patch we get page-level hash index
SIREAD locks, and perhaps in a future patch we could add fine-grained
hash value SIREAD locks (maybe as described above, if that works...).
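
(For what it's worth, I'd expect the page-level version to amount to
little more than the following at the point where the scan has pinned
its bucket's primary page; I haven't checked the patch against this
sketch, so the placement and variable names are assumptions:

    /* Sketch: page-level SIREAD lock on the bucket's primary page,
     * roughly where _hash_first() has pinned the bucket buffer.
     * 'rel', 'buf' and 'scan' stand for the usual scan-local
     * variables; exact placement is not taken from the patch. */
    PredicateLockPage(rel,
                      BufferGetBlockNumber(buf),
                      scan->xs_snapshot);

PredicateLockPage() and BufferGetBlockNumber() are existing functions.)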

--
Thomas Munro
http://www.enterprisedb.com
