From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Page Scan Mode in Hash Index
Date: 2017-08-19 06:07:31
Message-ID: CAA4eK1+Pkq_F-efLA46tas9DnZzBMCb2O=5C-P3W+2kwxK6zuQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Aug 11, 2017 at 6:51 PM, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com> wrote:
>>
>> 7.
>> _hash_kill_items(IndexScanDesc scan)
>> {
>> ..
>> + /*
>> + * If page LSN differs it means that the page was modified since the
>> + * last read. killedItems could be not valid so LP_DEAD hints apply-
>> + * ing is not safe.
>> + */
>> + page = BufferGetPage(buf);
>> + if (PageGetLSN(page) != so->currPos.lsn)
>> + {
>> + _hash_relbuf(rel, buf);
>> + return;
>> + }
>> ..
>> }
>>
>> How does this check cover the case of unlogged tables?
>
> Thanks for putting that point. It doesn't cover the case for unlogged
> tables. As suggested by you in one of your email in this mailing list, i am
> now not allowing vacuum to release lock on current page before acquiring
> lock on next page for unlogged tables. This will ensure that scan is always
> behind vacuum if they are running on the same bucket simultaneously.
> Therefore, there is danger in marking tuples as dead for unlogged pages even
> if they are not having any lsn.
>

In the last line, I guess you wanted to say "there is *no* danger
..."? Today, while thinking again about this part of the patch
(_hash_kill_items), it occurred to me that we can't rely on a pin on
an overflow page to guarantee that it is not modified by vacuum.
Consider a case where vacuum started vacuuming the bucket before the
scan, and the scan then overtakes it. Even if the scan holds a pin on
a page (but no lock), vacuum might clean that page; if that happens,
we can't prevent the reuse of a TID. What do you think?

Few other comments:

1.
+ * On failure exit (no more tuples), we return FALSE with pin
+ * pin held on bucket page but no pins or locks held on overflow
+ * page.
*/
bool
_hash_next(IndexScanDesc scan, ScanDirection dir)

In the above part of the comment, 'pin' is used twice.

0003-Improve-locking-startegy-during-VACUUM-in-Hash-Index_v5

2.
- * not at all by the rearrangement we are performing here. To prevent
- * any concurrent scan to cross the squeeze scan we use lock chaining
- * similar to hasbucketcleanup. Refer comments atop hashbucketcleanup.
+ * not at all by the rearrangement we are performing here.

In _hash_squeezebucket, we still use lock chaining, so removing the
above comment doesn't seem like a good idea. I think you should copy
part of the comment from hashbucketcleanup starting from "There can't
be any concurrent .."

3.
_hash_freeovflpage()
{
..

* Concurrency issues are avoided by using lock chaining as
* described atop hashbucketcleanup.
..
}

After fixing #2, you need to change the function name in the above comment.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
