Re: Improving spin-lock implementation on ARM.

From: Krunal Bauskar <krunalbauskar(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improving spin-lock implementation on ARM.
Date: 2020-12-01 15:19:06
Message-ID: CAB10pyboVUQkkkBTSJ9G7s-U+aaVBZGerGQuAbKixBZ-uuxarg@mail.gmail.com
Lists: pgsql-hackers

On Tue, 1 Dec 2020 at 20:25, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
wrote:

> On Tue, Dec 1, 2020 at 3:44 PM Krunal Bauskar <krunalbauskar(at)gmail(dot)com>
> wrote:
> > I have completed benchmarking with lse.
> >
> > Graph attached.
>
> Thank you for benchmarking.
>
> Now I agree with this comment by Tom Lane
>
> > In general, I'm pretty skeptical of *all* the results posted so far on
> > this thread, because everybody seems to be testing exactly one machine.
> > If there's one thing that it's safe to assume about ARM, it's that
> > there are a lot of different implementations; and this area seems very
> > very likely to differ across implementations.
>
> Different ARM implementations look too different. As you pointed out,
> LSE is enabled in gcc-10 by default. I doubt we can accept a patch,
> which gives benefits for specific platform and only when the compiler
> isn't very modern. Also, we didn't cover all ARM platforms. Given
> they are so different, we can't guarantee that patch doesn't cause
> regression of some ARM. Additionally, the effect of the CAS patch
> even for Kunpeng seems modest. It makes the drop off of TPS more
> smooth, but it doesn't change the trend.
>

There are 2 parts:

* Does the CAS patch help scale PGSQL? Yes.
* Is LSE beneficial for all architectures? Probably not.

The patch addresses only the former, which holds true in all cases.
(Enabling LSE should be an independent process.)
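
To make the distinction concrete, here is a minimal sketch of a swap-based
versus a CAS-based lock attempt. It assumes the GCC/Clang atomic builtins;
the demo_* names and the demo_slock_t type are hypothetical and chosen for
illustration only. It is not the patch itself, just the general shape of
the idea.

```c
#include <stdbool.h>

typedef int demo_slock_t;	/* hypothetical stand-in for a spinlock word */

/*
 * Swap-based attempt: unconditionally stores 1 and returns the old value,
 * so the store happens even when the lock is already held.
 */
static inline int
demo_tas_swap(volatile demo_slock_t *lock)
{
	return __sync_lock_test_and_set(lock, 1);
}

/*
 * CAS-based attempt: stores 1 only if the lock word currently holds 0.
 * Returns 0 when the lock was acquired, non-zero when it was already held,
 * matching the return convention of the swap variant above.
 */
static inline int
demo_tas_cas(volatile demo_slock_t *lock)
{
	demo_slock_t expected = 0;

	return !__atomic_compare_exchange_n(lock, &expected, 1, false,
										__ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE);
}

/* Release: writes 0 with release semantics. */
static inline void
demo_unlock(volatile demo_slock_t *lock)
{
	__sync_lock_release(lock);
}
```

Both variants return 0 on success, so a caller can spin on either one in
the same way. The difference is only in what happens when the lock is
already held: the swap still writes to the lock word, the CAS does not.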

gcc-10 made it the default, but according to [1] Canonical decided to
remove it as the default in Ubuntu 20.04, which (probably) means LSE has
not passed Canonical's testing.
Also, most distros have not yet started shipping GCC-10, so it will be a
long time before it reaches all of them.
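
As a rough way to see which situation a given build is in, the sketch
below reports whether the compiler was told to inline LSE atomics. It
assumes the ACLE macro __ARM_FEATURE_ATOMICS, which compilers define when
the target architecture includes LSE (for example with
-march=armv8.1-a); the function name is hypothetical. It cannot tell
whether a runtime-dispatched (outline) helper ends up using LSE on a
particular machine.

```c
/*
 * Report how atomics are likely to be generated for this build.
 * demo_lse_build_mode is a hypothetical helper name.
 */
static const char *
demo_lse_build_mode(void)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_ATOMICS)
	/* LSE instructions are emitted inline by the compiler */
	return "aarch64: LSE atomics enabled at compile time";
#elif defined(__aarch64__)
	/* either load/store-exclusive loops or runtime-dispatched helpers */
	return "aarch64: LSE not enabled at compile time";
#else
	return "not aarch64";
#endif
}
```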

So if we keep the LSE effect aside and look at the patch purely from a
performance standpoint, it clearly helps achieve a good gain. I see an
improvement in the range of 10-40%.
Amit also observed a gain in the same range during his independent
testing, and your testing with G-2 has re-confirmed the same point.
Pardon me if this is modest by pgsql standards.

At a scalability of 1024, PGSQL on other arches (beyond ARM) also
struggles to scale, so there is something more inherent that needs to be
addressed from a generic perspective.

Also, the said patch helps not only pgbench-like workloads but other
workloads too.

--------------

I would request you to re-think it from this perspective, to help ensure
that PGSQL can scale well on ARM.
s_lock becomes a top-most function, and LSE is not a universal solution,
but CAS surely helps ease the main bottleneck.

And surely let me know if more data is needed.

Link:
[1]: https://www.postgresql.org/message-id/flat/099F69EE-51D3-4214-934A-1F28C0A1A7A7%40amazon.com

> ------
> Regards,
> Alexander Korotkov
>

--
Regards,
Krunal Bauskar
