Re: Lazy hash table for XidInMVCCSnapshot (helps Zipfian a bit)

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, pgsql-hackers-owner(at)postgresql(dot)org
Subject: Re: Lazy hash table for XidInMVCCSnapshot (helps Zipfian a bit)
Date: 2017-08-10 15:12:34
Message-ID: CAPpHfdvO0fO_qB2wXWQJ4A7LHv66AdRMrcU63QOpu5=ezk+viw@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 10, 2017 at 3:30 PM, Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
wrote:

> On 2017-07-31 18:56, Sokolov Yura wrote:
>
>> Good day, everyone.
>>
>> In an attempt to improve the performance of YCSB under a Zipfian
>> distribution, I found that significant time is spent in
>> XidInMVCCSnapshot scanning the snapshot->xip array. While the
>> overall CPU time is not very noticeable, it has a measurable impact
>> on scalability.
>>
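>> For reference, the scan in question is essentially this loop
>> (simplified from XidInMVCCSnapshot; the real function also checks
>> the subxip array), which costs O(xcnt) per visibility check:
>>
>>     uint32      i;
>>
>>     for (i = 0; i < snapshot->xcnt; i++)
>>     {
>>         if (TransactionIdEquals(xid, snapshot->xip[i]))
>>             return true;
>>     }
>>     return false;
>>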
>> First I tried sorting snapshot->xip in GetSnapshotData and
>> searching the sorted array. But since snapshot->xip is not touched
>> at all when there is no transaction contention, always sorting xip
>> is not the best option.
>>
>> Then I sorted the xip array on demand in XidInMVCCSnapshot, only
>> when a search in snapshot->xip actually occurs (i.e. lazy sorting).
>> It performs much better, but since sorting is O(N log N), its
>> execution time is noticeable with a large number of clients.
>>
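>> A minimal sketch of that lazy-sort variant (the xip_sorted flag and
>> the function names here are illustrative, not the exact patch code):
>>
>>     static int
>>     xid_cmp(const void *a, const void *b)
>>     {
>>         TransactionId xa = *(const TransactionId *) a;
>>         TransactionId xb = *(const TransactionId *) b;
>>
>>         return (xa < xb) ? -1 : (xa > xb) ? 1 : 0;
>>     }
>>
>>     static bool
>>     XidInXipSorted(Snapshot snapshot, TransactionId xid)
>>     {
>>         /* Sort once, on the first actual lookup into xip. */
>>         if (!snapshot->xip_sorted)
>>         {
>>             qsort(snapshot->xip, snapshot->xcnt,
>>                   sizeof(TransactionId), xid_cmp);
>>             snapshot->xip_sorted = true;
>>         }
>>         /* Every probe after that is an O(log N) binary search. */
>>         return bsearch(&xid, snapshot->xip, snapshot->xcnt,
>>                        sizeof(TransactionId), xid_cmp) != NULL;
>>     }
>>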
>> The third approach (implemented in the attached patch) is to build
>> a hash table lazily on the first search of the xip array.
>>
>> Note: the hash table is not built if the number of "in-progress"
>> xids is less than 60. Tests show no big benefit from building it
>> below that threshold (at least on Intel Xeon).
>>
>> With regards,
>>
>
> Here is a new, more correct version of the patch, along with pgbench
> results for a Zipfian distribution (50% read, 50% write), the same as
> Alik's tests at
> https://www.postgresql.org/message-id/BF3B6F54-68C3-417A-BFAB-FB4D66F2B410(at)postgrespro(dot)ru
>
>
> clients | master | hashsnap2 | hashsnap2_lwlock
> --------+--------+-----------+-----------------
>      10 | 203384 |    203813 |           204852
>      20 | 334344 |    334268 |           363510
>      40 | 228496 |    231777 |           383820
>      70 | 146892 |    148173 |           221326
>     110 |  99741 |    111580 |           157327
>     160 |  65257 |     81230 |           112028
>     230 |  38344 |     56790 |            77514
>     310 |  22355 |     39249 |            55907
>     400 |  13402 |     26899 |            39742
>     500 |   8382 |     17855 |            28362
>     650 |   5313 |     11450 |            17497
>     800 |   3352 |      7816 |            11030
>
> (Note: I'm not quite sure why my numbers for master are lower than
> Alik's. Our test methodologies probably differ a bit; perhaps it is
> because I don't recreate the tables between changes in client count,
> only between branch switches.)
>

These results look very cool!
I think CSN is eventually inevitable, but it's still a long way off.
Thus, this optimization could serve us well until we have CSN.
Do you expect any cases where this patch causes a slowdown? What about
the case where we scan each xip array only once (I'm not sure how to
simulate that using pgbench)?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
