Re: Improving spin-lock implementation on ARM.

From: Krunal Bauskar <krunalbauskar(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving spin-lock implementation on ARM.
Date: 2020-12-14 12:36:46
Message-ID: CAB10pyZHPiYU7=QsfMPXuxFf_eFvnsoW3gjJzDnKgiV6wUjsOQ@mail.gmail.com

Wondering if we can take this to completion (any idea what more we could
do?).
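
For anyone joining the thread late, here is a minimal sketch of the two acquire styles being compared. It is not the patch itself: the names (demo_slock_t, demo_tas_acquire, demo_cas_acquire, demo_release, demo_selfcheck) are hypothetical, and it assumes a GCC/Clang compiler that provides the __atomic builtins. The TAS variant always writes the lock word, while the CAS variant writes only when the word is currently free.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock word for illustration: 0 = free, 1 = held. */
typedef int demo_slock_t;

/* TAS style: unconditionally swap in 1 and inspect the old value.
 * Returns true if the lock was free and is now held by the caller. */
static bool
demo_tas_acquire(volatile demo_slock_t *lock)
{
	return __atomic_exchange_n(lock, 1, __ATOMIC_ACQUIRE) == 0;
}

/* CAS style: store 1 only if the word currently holds 0.
 * Returns true if the lock was free and is now held by the caller. */
static bool
demo_cas_acquire(volatile demo_slock_t *lock)
{
	demo_slock_t expected = 0;

	return __atomic_compare_exchange_n(lock, &expected, 1,
									   false,	/* strong CAS */
									   __ATOMIC_ACQUIRE,
									   __ATOMIC_RELAXED);
}

/* Release is the same for both styles: store 0 with release ordering. */
static void
demo_release(volatile demo_slock_t *lock)
{
	__atomic_store_n(lock, 0, __ATOMIC_RELEASE);
}

/* Single-threaded sanity check of the semantics; returns 0 on success. */
static int
demo_selfcheck(void)
{
	demo_slock_t lock = 0;

	assert(demo_tas_acquire(&lock));	/* free -> held */
	assert(!demo_tas_acquire(&lock));	/* already held */
	demo_release(&lock);
	assert(demo_cas_acquire(&lock));	/* free -> held */
	assert(!demo_cas_acquire(&lock));	/* already held */
	demo_release(&lock);
	return lock;						/* 0: lock is free again */
}
```

A caller would spin on either acquire function until it returns true and call demo_release when done. The sketch only shows the semantic difference; it says nothing about how either variant performs on a given ARM core.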

On Thu, 10 Dec 2020 at 14:48, Krunal Bauskar <krunalbauskar(at)gmail(dot)com>
wrote:

>
> On Tue, 8 Dec 2020 at 14:33, Krunal Bauskar <krunalbauskar(at)gmail(dot)com>
> wrote:
>
>>
>>
>> On Thu, 3 Dec 2020 at 21:32, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>>> Krunal Bauskar <krunalbauskar(at)gmail(dot)com> writes:
>>> > Any updates or further inputs on this.
>>>
>>> As far as LSE goes: my take is that tampering with the
>>> compiler/platform's default optimization options requires *very*
>>> strong evidence, which we have not got and likely won't get. Users
>>> who are building for specific hardware can choose to supply custom
>>> CFLAGS, of course. But we shouldn't presume to do that for them,
>>> because we don't know what they are building for, or with what.
>>>
>>> I'm very willing to consider the CAS spinlock patch, but it still
>>> feels like there's not enough evidence to show that it's a universal
>>> win. The way to move forward on that is to collect more measurements
>>> on additional ARM-based platforms. And I continue to think that
>>> pgbench is only a very crude tool for testing spinlock performance;
>>> we should look at other tests.
>>>
>>
>> Thanks Tom.
>>
>> Given pgbench's limited options, I decided to try sysbench with a
>> zipfian distribution to expose the real contention (a zipfian pattern
>> concentrates updates on part of the database, thereby exposing the main
>> contention point).
>>
>>
>> ----------------------------------------------------------------------------
>> *Baseline for 256 threads update-index use-case:*
>>   44.24%  174935  postgres  postgres  [.] s_lock
>>   transactions: 5587105 (92988.40 per sec.)
>>
>> *Patched for 256 threads update-index use-case:*
>>   0.02%  80  postgres  postgres  [.] s_lock
>>   transactions: 10288781 (171305.24 per sec.)
>>
>> *perf diff*
>>   0.02%  +44.22%  postgres  [.] s_lock
>> ----------------------------------------------------------------------------
>>
>> As the results above show, s_lock is the major contention point, and the
>> said CAS patch relieves it. A performance improvement in the range of 80%
>> is observed (92988 -> 171305 transactions per sec.).
>>
>> Following this approach, we ran it at every scalability level for both
>> update and non-update use-cases.
>> See the attached graph; a consistent improvement is observed.
>>
>> I presume this helps re-establish that, for major contention cases, the
>> existing TAS approach will always lose out.
>>
>>
>> -------------------------------------------------------------------------------------------
>>
>> Unfortunately, I don't have access to ARM architectures other than
>> Kunpeng and Graviton2, where we have already proved the value of the
>> patch.
>> [ref: on Apple M1, as per your evaluation, the patch doesn't show a
>> regression for select. If possible, could you try update scenarios too?]
>>
>> Do you know anyone from the community who has access to other ARM
>> architectures whom we could ask to evaluate it?
>> But since it has been proven on 2 independent ARM architectures, I am
>> pretty confident it will scale on other ARM architectures too.
>>
>>
>
> Any direction on how we can proceed on this?
>
> * We have tested it with both cloud vendors that provide ARM instances.
> * We have tested it with Apple M1 (partially, at least).
> * Ampere used to provide instances on packet.com, but now it is an
> evaluation program only.
>
> No other cloud provider currently offers ARM instances.
>
> Given that our evaluation so far has been positive, can we consider the
> patch on the basis of the available data, which is quite encouraging,
> with an 80% improvement seen for heavy-contention use-cases?
>
>
>
>>
>>> From a system structural standpoint, I seriously dislike that lwlock.c
>>> patch: putting machine-specific variant implementations into that file
>>> seems like a disaster for maintainability. So it would need to show a
>>> very significant gain across a range of hardware before I'd want to
>>> consider adopting it ... and it has not shown that.
>>>
>>> regards, tom lane
>>>
>>
>>
>> --
>> Regards,
>> Krunal Bauskar
>>
>
>
> --
> Regards,
> Krunal Bauskar
>

--
Regards,
Krunal Bauskar
