From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Shubham Barai <shubhambaraiss(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Borodin <amborodin86(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index
Date: 2018-03-06 05:56:06
Message-ID: CAA4eK1+fesG1K_4dynNCaF5d-X28ytepyRHKODzyqT4jzuK_JA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 5, 2018 at 8:58 AM, Thomas Munro
<thomas(dot)munro(at)enterprisedb(dot)com> wrote:
> On Sun, Mar 4, 2018 at 12:53 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>> Yes, but I think it would be better if we call this once we are sure
>> that at least one tuple from the old bucket has been transferred
>> (consider if all tuples in the old bucket are dead). Apart from this,
>> I think this patch has missed handling the cases where we scan the
>> buckets when the split is in progress. In such cases, we scan both
>> old and new bucket, so I think we need to ensure that we take
>> PredicateLock on both the buckets during such scans.
>
> Hmm. Yeah.
>
> So, in _hash_first(), do you think we might just need this?
>
> if (H_BUCKET_BEING_POPULATED(opaque))
> {
>     ...
>     old_blkno = _hash_get_oldblock_from_newbucket(rel, bucket);
>     ...
>     old_buf = _hash_getbuf(rel, old_blkno, HASH_READ, LH_BUCKET_PAGE);
> +   PredicateLockPage(rel, BufferGetBlockNumber(old_buf),
> +                     scan->xs_snapshot);
>     TestForOldSnapshot(scan->xs_snapshot, rel, BufferGetPage(old_buf));
>
> That is, if you begin scanning a 'new' bucket, we remember the old
> bucket and go and scan that too, so we'd better predicate-lock both up
> front (or I suppose we could do it later when we visit that page, but
> here it can be done in a single place).
>

Yeah, that can work, but I am slightly worried that we might never
actually scan the old bucket (say, for queries with a LIMIT clause), in
which case it could give false positives for insertions into the old
bucket.
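
The "do it later when we visit that page" variant you mention might look
roughly like the below, placed where the scan switches from the new
bucket to the old one (_hash_readnext() in hashsearch.c). This is only a
sketch against the current HashScanOpaque fields, not compiled or
tested:

    /*
     * When the scan of the new bucket is exhausted and we move to the
     * old (being-split) bucket, take the predicate lock only now, so a
     * scan satisfied entirely from the new bucket (e.g. under LIMIT)
     * never locks pages it does not read.
     */
    if (so->hashso_buc_populated && !so->hashso_buc_split)
    {
        *bufp = so->hashso_split_bucket_buf;
        PredicateLockPage(rel, BufferGetBlockNumber(*bufp),
                          scan->xs_snapshot);
        ...
    }

That way the predicate lock on the old bucket is acquired exactly when
the scan starts reading it.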

> What if we begin scanning an 'old' bucket that is being split? I
> think we'll only do that for tuples that actually belong in the old
> bucket after the split, so no need to double-lock? And I don't think
> a split could begin while we are scanning. Do I have that right?
>

Right.

> As for inserting, I'm not sure if any special treatment is needed, as
> long as the scan code path (above) and the split code path are
> correct. I'm not sure though.
>

I also don't think we need any special handling for insertions.

> I'm wondering how to test all this. I'm thinking about a program that
> repeatedly creates a hash index and then slowly adds more things to it
> so that buckets split (maybe using distinct keys carefully crafted to
> hit the same bucket?), while concurrently hammering it with a ton of
> scans and then ... somehow checking correctness...
>

Yeah, that will generate the required errors, but I am not sure how to
verify correctness. One idea could be that while the split is in
progress, we somehow stop it midway (say, via a cancel request) and then
run targeted selects and inserts on the bucket being scanned and the
bucket being populated.
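
As a rough starting point for the workload, something like the below
could drive splits under concurrent serializable scans (table name,
sizes, and key values invented for illustration):

    -- Grow a hash index slowly so bucket splits happen while
    -- concurrent serializable scans are running.
    CREATE TABLE hash_split_test (k integer);
    CREATE INDEX hash_split_test_idx ON hash_split_test USING hash (k);

    -- Writer session: insert in small batches to split buckets gradually.
    INSERT INTO hash_split_test SELECT g FROM generate_series(1, 10000) g;

    -- Reader sessions (many, repeatedly):
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT count(*) FROM hash_split_test WHERE k = 42;
    -- a conflicting insert of k = 42 from another serializable session
    -- should then cause a serialization failure in one of the two
    COMMIT;

For the targeted variant, an isolation-tester spec that pauses the split
midway could then check that a select on the bucket being scanned
conflicts with an insert into the bucket being populated, and vice
versa.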

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
