|From:||Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>|
|To:||Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>|
|Cc:||Robert Haas <robertmhaas(at)gmail(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>|
|Subject:||Re: Page Scan Mode in Hash Index|
|Views:||Raw Message | Whole Thread | Download mbox|
Thanks for the review.
>>> I am suspicious that _hash_kill_items() is going to have problems if
>>> the overflow page is freed before it reacquires the lock.
>>> _btkillitems() contains safeguards against similar cases.
>> I have added similar check for hash_kill_items() as well.
> I think hash_kill_items has a much bigger problem which is that we
> can't kill the items if the page has been modified after re-reading
> it. Consider a case where Vacuum starts before the Scan on the bucket
> and then Scan went ahead (which is possible after your 0003 patch) and
> noted the killed items in killed item array and before it could kill
> all the items, Vacuum remove those items. Now it is quite possible
> that before scan tries to kill those items, the corresponding itemids
> have been used by different tuples. I think what we need to do here
> is to store the LSN of page first time when we have read the page and
> then compare it with current page lsn after re-reading the page in
Okay, understood. I have done the changes to handle this type of
scenario. Please refer to the attached patches. Thanks.
> + HashScanPosItem items[MaxIndexTuplesPerPage]; /* MUST BE LAST */
> +} HashScanPosData;
> HashScanPosItem *killedItems; /* tids and offset numbers of killed items */
> int numKilled; /* number of currently stored items */
> + /*
> + * Identify all the matching items on a page and save them
> + * in HashScanPosData
> + */
> + HashScanPosData currPos; /* current position data */
> } HashScanOpaqueData;
> After having array of HashScanPosItems as currPos.items, I think you
> don't need array of HashScanPosItem for killedItems, just an integer
> array of index in currPos.items should be sufficient.
> I think below para in README is not valid after this patch.
> "To keep concurrency reasonably good, we require readers to cope with
> concurrent insertions, which means that they have to be able to
> their current scan position after re-acquiring the buffer content lock
> on page. Since deletion is not possible while a reader holds the pin
> on bucket, and we assume that heap tuple TIDs are unique, this can be
> implemented by searching for the same heap tuple TID previously
> returned. Insertion does not move index entries across pages, so the
> previously-returned index entry should always be on the same page, at
> the same or higher offset number,
> as it was before."
I have modified above paragraph in README file.
> - next_buf = _hash_getbuf_with_strategy(rel, blkno, HASH_WRITE,
> - LH_OVERFLOW_PAGE,
> - bstrategy);
> - * release the lock on previous page after acquiring the lock on next
> - * page
> + * As the hash index scan work in page at a time mode,
> + * vacuum can release the lock on previous page before
> + * acquiring lock on the next page.
> + next_buf = _hash_getbuf_with_strategy(rel, blkno, HASH_WRITE,
> + LH_OVERFLOW_PAGE,
> + bstrategy);
> After this change, you need to modify comments on top of function
> hashbucketcleanup() and _hash_squeezebucket().
Please note that these patches needs to be applied on top of .
|Next Message||Jeff Davis||2017-04-02 00:36:42||New expression evaluator and indirect jumps|
|Previous Message||Kevin Grittner||2017-04-01 22:00:14||Re: GSoC 2017 Proposal for "Explicitly support predicate locks in index access methods besides btree"|