Re: Page Scan Mode in Hash Index

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Page Scan Mode in Hash Index
Date: 2017-04-04 10:26:26
Message-ID: CAA4eK1JrePM0_ai64Dt5WAiaERET=Kuow-FyOGLDaNLSzBF6rw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Apr 2, 2017 at 4:14 AM, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
>
> Please note that these patches needs to be applied on top of [1].
>

Few more review comments:

1.
- page = BufferGetPage(so->hashso_curbuf);
+ blkno = so->currPos.currPage;
+ if (so->hashso_bucket_buf == so->currPos.buf)
+ {
+ buf = so->currPos.buf;
+ LockBuffer(buf, BUFFER_LOCK_SHARE);
+ page = BufferGetPage(buf);
+ }

Here, you are assuming that only bucket page can be pinned, but I
think that assumption is not right. When _hash_kill_items() is called
before moving to next page, there could be a pin on the overflow page.
You need some logic to check if the buffer is pinned, then just lock
it. I think once you do that, it might not be convinient to release
the pin at the end of this function.

Add some comments on top of _hash_kill_items to explain the new
changes or say some thing like "See _bt_killitems for details"

2.
+ /*
+ * We save the LSN of the page as we read it, so that we know whether it
+ * safe to apply LP_DEAD hints to the page later. This allows us to drop
+ * the pin for MVCC scans, which allows vacuum to avoid blocking.
+ */
+ so->currPos.lsn = PageGetLSN(page);
+

The second part of above comment doesn't make sense because vacuum's
will still be blocked because we hold the pin on primary bucket page.

3.
+ {
+ /*
+ * No more matching tuples were found. return FALSE
+ * indicating the same. Also, remember the prev and
+ * next block number so that if fetching tuples using
+ * cursor we remember the page from where to start the
+ * scan.
+ */
+ so->currPos.prevPage = (opaque)->hasho_prevblkno;
+ so->currPos.nextPage = (opaque)->hasho_nextblkno;

You can't read opaque without having lock and by this time it has
already been released. Also, I think if you want to save position for
cursor movement, then you need to save the position of last bucket
when scan completes the overflow chain, however as you have written it
will be always invalid block number. I think there is similar problem
with prevblock number.

4.
+_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum,
+ OffsetNumber maxoff, ScanDirection dir)
+{
+ HashScanOpaque so = (HashScanOpaque) scan->opaque;
+ IndexTuple itup;
+ int itemIndex;
+
+ if (ScanDirectionIsForward(dir))
+ {
+ /* load items[] in ascending order */
+ itemIndex = 0;
+
+ /* new page, relocate starting position by binary search */
+ offnum = _hash_binsearch(page, so->hashso_sk_hash);

What is the need to find offset number when this function already has
that as an input parameter?

5.
+_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum,
+ OffsetNumber maxoff, ScanDirection dir)

I think maxoff is not required to be passed a parameter to this
function as you need it only for forward scan. You can compute it
locally.

6.
+_hash_load_qualified_items(IndexScanDesc scan, Page page, OffsetNumber offnum,
+ OffsetNumber maxoff, ScanDirection dir)
{
..
+ if (ScanDirectionIsForward(dir))
+ {
..
+ while (offnum <= maxoff)
{
..
+ if (so->hashso_sk_hash == _hash_get_indextuple_hashkey(itup) &&
+ _hash_checkqual(scan, itup))
+ {
+ /* tuple is qualified, so remember it */
+ _hash_saveitem(so, itemIndex, offnum, itup);
+ itemIndex++;
+ }
+
+ offnum = OffsetNumberNext(offnum);
..

Why are you traversing the whole page even when there is no match?
There is a similar problem in backward scan. I think this is blunder.

7.
+ if (so->currPos.buf == so->hashso_bucket_buf ||
+ so->currPos.buf == so->hashso_split_bucket_buf)
+ {
+ so->currPos.prevPage = InvalidBlockNumber;
+ LockBuffer(so->currPos.buf, BUFFER_LOCK_UNLOCK);
+ }
+ else
+ {
+ so->currPos.prevPage = (opaque)->hasho_prevblkno;
+ _hash_relbuf(rel, so->currPos.buf);
+ }
+
+ so->currPos.nextPage = (opaque)->hasho_nextblkno;

What makes you think it is safe read opaque after releasing the lock?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2017-04-04 10:29:05 Re: Page Scan Mode in Hash Index
Previous Message Kyotaro HORIGUCHI 2017-04-04 10:25:39 Re: asynchronous execution