Re: Page Scan Mode in Hash Index

From: Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>
To: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Page Scan Mode in Hash Index
Date: 2017-03-22 13:32:11
Message-ID: CAE9k0P=QfrT+ZvLrVXDPiVL61FKjc35H2eQHGHaz687n2vCGVQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

>> The attached patch modifies the hash index scan code for page-at-a-time
>> mode. For better readability, I have split it into 3 parts,
>>
>
> Due to the commits on master these patches apply with hunks.
>
> The README should be updated to mention the use of page scan.

Done. Please refer to the attached v2 version of the patch.

>
> hash.h needs pg_indent.

Fixed.

>
>> 1) 0001-Rewrite-hash-index-scans-to-work-a-page-at-a-time.patch: this
>> patch rewrites the hash index scan module to work in page-at-a-time
>> mode. It introduces two new functions: _hash_readpage() and
>> _hash_saveitem(). The former loads all the qualifying tuples from a
>> target bucket or overflow page into an items array. The latter is
>> used by _hash_readpage() to save the qualifying tuples found in a
>> page into that items array. Apart from that, this patch basically
>> cleans up _hash_first(), _hash_next() and hashgettuple().
>>
>
> For _hash_next I don't see this - can you explain ?

Sorry, it was wrongly copied from the btree code. I have corrected it now.
Please check the attached v2 version of the patch.
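
For anyone following the thread without the patch open, the page-at-a-time
idea described above can be sketched roughly as below. This is only a toy
model and not code from the patch: ToyPage, ToyScan, toy_saveitem(),
toy_readpage(), toy_begin() and toy_next() are hypothetical names, pages are
plain structs instead of buffers, and there is no pinning or locking. It
only shows the shape: collect every match on one page into an items array,
hand tuples out from that array, and move to the next overflow page only
when the array is exhausted.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ITEMS_PER_PAGE 8

/* Hypothetical stand-in for a bucket or overflow page. */
typedef struct ToyPage {
    int             nkeys;                      /* tuples on this page */
    unsigned        keys[MAX_ITEMS_PER_PAGE];   /* hash key of each tuple */
    int             tids[MAX_ITEMS_PER_PAGE];   /* stand-in for heap TIDs */
    struct ToyPage *next;                       /* next overflow page or NULL */
} ToyPage;

/* Hypothetical scan state holding the matches from one page. */
typedef struct ToyScan {
    unsigned  key;                        /* key being searched for */
    ToyPage  *next_page;                  /* page to read next, NULL at end */
    int       items[MAX_ITEMS_PER_PAGE];  /* matches from the last page read */
    int       nitems;                     /* number of valid entries in items */
    int       next_item;                  /* next entry of items to return */
} ToyScan;

/* Loose analogue of _hash_saveitem(): remember one qualifying tuple. */
static void toy_saveitem(ToyScan *scan, int tid)
{
    scan->items[scan->nitems++] = tid;
}

/* Loose analogue of _hash_readpage(): collect all matches on one page.
 * Return 1 if any matching items are found, else 0. */
static int toy_readpage(ToyScan *scan, const ToyPage *page)
{
    scan->nitems = 0;
    scan->next_item = 0;
    for (int i = 0; i < page->nkeys; i++)
        if (page->keys[i] == scan->key)
            toy_saveitem(scan, page->tids[i]);
    return scan->nitems > 0;
}

/* Start a scan of the bucket chain beginning at 'bucket'. */
static void toy_begin(ToyScan *scan, ToyPage *bucket, unsigned key)
{
    scan->key = key;
    scan->next_page = bucket;
    scan->nitems = 0;
    scan->next_item = 0;
}

/* Return the next match through *tid; return 0 when there are no more. */
static int toy_next(ToyScan *scan, int *tid)
{
    while (scan->next_item >= scan->nitems) {
        ToyPage *page = scan->next_page;

        if (page == NULL)
            return 0;               /* end of the bucket chain */
        scan->next_page = page->next;
        toy_readpage(scan, page);   /* refill items[] from this page */
    }
    *tid = scan->items[scan->next_item++];
    return 1;
}
```

The point of the toy is that each page is visited exactly once per scan, so
in a real index the page lock would only be needed while toy_readpage() runs
and not while the caller consumes the saved items.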

>
> + *
> + * On failure exit (no more tuples), we release pin and set
> + * so->currPos.buf to InvalidBuffer.
>
>
> + * Returns true if any matching items are found else returns false.
>
> s/Returns/Return/g

Done.

>
>> 2) 0002-Remove-redundant-function-_hash_step-and-some-of-the.patch:
>> this patch removes the redundant function _hash_step() and some of
>> the unused members of the HashScanOpaqueData structure.
>>
>
> Looks good.
>
>> 3) 0003-Improve-locking-startegy-during-VACUUM-in-Hash-Index.patch:
>> this patch improves the locking strategy for VACUUM in hash indexes.
>> As the new hash index scan works in page-at-a-time mode, vacuum can
>> release the lock on the previous page before acquiring a lock on the
>> next page, which improves hash index concurrency.
>>
>
> + * As the new hash index scan work in page at a time mode,
>
> Remove 'new'.

Done.
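
To make the difference in lock ordering concrete, here is a small sketch
that counts how many page locks a bucket walk holds at once. It is not taken
from the patch: ToyLocks, toy_lock(), toy_unlock(), walk_coupled() and
walk_release_first() are hypothetical names, the "locks" are just flags, and
the coupled variant is my assumption about what a lock-chaining walk looks
like in general.

```c
#include <assert.h>

#define NPAGES 4    /* pages in the toy bucket chain */

/* Hypothetical bookkeeping: which pages are locked and the peak count. */
typedef struct ToyLocks {
    int held[NPAGES];
    int nheld;
    int max_held;
} ToyLocks;

static void toy_lock(ToyLocks *l, int page)
{
    assert(!l->held[page]);
    l->held[page] = 1;
    l->nheld++;
    if (l->nheld > l->max_held)
        l->max_held = l->nheld;
}

static void toy_unlock(ToyLocks *l, int page)
{
    assert(l->held[page]);
    l->held[page] = 0;
    l->nheld--;
}

/* Lock chaining: take the next page's lock before dropping the current one. */
static int walk_coupled(void)
{
    ToyLocks l = { {0}, 0, 0 };

    toy_lock(&l, 0);
    for (int page = 0; page + 1 < NPAGES; page++) {
        /* ... clean up 'page' here ... */
        toy_lock(&l, page + 1);
        toy_unlock(&l, page);
    }
    toy_unlock(&l, NPAGES - 1);
    return l.max_held;
}

/* Ordering described above: drop the current lock, then take the next. */
static int walk_release_first(void)
{
    ToyLocks l = { {0}, 0, 0 };

    toy_lock(&l, 0);
    for (int page = 0; page + 1 < NPAGES; page++) {
        /* ... clean up 'page' here ... */
        toy_unlock(&l, page);
        toy_lock(&l, page + 1);
    }
    toy_unlock(&l, NPAGES - 1);
    return l.max_held;
}
```

In this toy, walk_coupled() peaks at two locks held together while
walk_release_first() never holds more than one, which is where the
concurrency gain would come from.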

>
>> I have also benchmarked this patch and would like to share the
>> results.
>>
>> First, I benchmarked with non-unique values and saw a performance
>> improvement of 4-7%. For the detailed results, please see the
>> attached file 'results-non-unique values-70ff'; ddl.sql and test.sql
>> are the test scripts used in this experiment. The non-default GUC
>> parameters and the pgbench command are listed in the result sheet.
>> I also benchmarked with unique values at scale factors 300 and 1000;
>> those results are in 'results-unique-values-default-ff'.
>>
>
> I'm seeing similar results, especially in write-heavy scenarios.

Great..!!

--
With Regards,
Ashutosh Sharma
EnterpriseDB: http://www.enterprisedb.com

Attachments:

  0001-Rewrite-hash-index-scans-to-work-a-page-at-a-timev2.patch   application/x-patch   23.6 KB
  0002-Remove-redundant-function-_hash_step-and-some-of-the.patch  application/x-patch   8.4 KB
  0003-Improve-locking-startegy-during-VACUUM-in-Hash-Index.patch  application/x-patch   1.3 KB
