Re: [PATCH] audo-detect and use -moutline-atomics compilation flag for aarch64

From: "Zidenberg, Tsahi" <tsahee(at)amazon(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] audo-detect and use -moutline-atomics compilation flag for aarch64
Date: 2020-09-06 21:00:02
Message-ID: 1C8D0E58-FB33-4105-AC00-8FA07621F5DD@amazon.com
Lists: pgsql-hackers

Hello!

First, I apologize for taking so long to answer. This e-mail regrettably got lost in my inbox.

On 24/07/2020, 4:17, "Andres Freund" <andres(at)anarazel(dot)de> wrote:

> What does "not significantly affected" exactly mean? Could you post the
> raw numbers?

The following results show benchmark behavior on an m6g.8xlarge instance (32 cores, with LSE support)
and an a1.4xlarge instance (16 cores, no LSE support), with and without the patch, based on PostgreSQL 12.4.
The tests are pgbench select-only/simple-update and sysbench read-only/write-only.

                     select-only  simple-update  read-only  write-only
m6g.8xlarge/vanilla       482130          56275     273327       33364
m6g.8xlarge/patch         493748          59681     262702       33024
a1.4xlarge/vanilla         82437          13978      62489        2928
a1.4xlarge/patch           79499          13932      62796        2945

Results obviously change with OS, parameters, etc. I have attempted to ensure a fair comparison,
but I don't think these numbers should be taken as absolute.
As reference points, here are an m6g instance compiled with the -march=native flag and an m5 (x86) instance:

m6g.8xlarge/native        522771          60354     261366       33582
m5.8xlarge                362908          58732     147730       32750

> I'm a bit concerned that the additional conditional
> branches on platforms without non ll/sc atomics could hurt noticably.

As can be seen in the a1 results, the difference for CPUs with no LSE atomic support is small.
Locks have one branch added, which is always taken the same way and is thus easy to predict.
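
To illustrate the extra branch, here is a minimal portable sketch of the dispatch idea, not the actual compiler-generated helper. The names have_lse_atomics, fetch_add_single, fetch_add_retry_loop and outlined_fetch_add are hypothetical, and the C11 compare-and-swap loop only stands in for a load-exclusive/store-exclusive sequence.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag: imagined to be set once at startup after probing the
 * CPU, and never changed afterwards. */
bool have_lse_atomics = false;

/* Stand-in for the single-instruction path (an LSE atomic add). */
static uint32_t fetch_add_single(_Atomic uint32_t *p, uint32_t v)
{
    return atomic_fetch_add(p, v);
}

/* Stand-in for the ll/sc path: retry until the update commits. */
static uint32_t fetch_add_retry_loop(_Atomic uint32_t *p, uint32_t v)
{
    uint32_t old = atomic_load_explicit(p, memory_order_relaxed);

    /* On failure, 'old' is refreshed with the current value and we retry. */
    while (!atomic_compare_exchange_weak_explicit(p, &old, old + v,
                                                  memory_order_seq_cst,
                                                  memory_order_relaxed))
        ;
    return old;
}

/* The outlined helper: one branch on a flag that never changes at run time,
 * so the branch predictor sees the same outcome on every call. */
uint32_t outlined_fetch_add(_Atomic uint32_t *p, uint32_t v)
{
    if (have_lse_atomics)
        return fetch_add_single(p, v);
    return fetch_add_retry_loop(p, v);
}
```

Both paths return the value held before the addition, so callers cannot tell which one was taken.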

> I'm surprised that read-only didn't benefit - with ll/sc that ought to
> have pretty high contention on a few lwlocks.

These results show only about a 6% performance increase in simple-update on m6g, and very close
performance in the other tests, most of which could be attributed to benchmark jitter.
These results from "well behaved" benchmarks do not show the full importance of using
outline-atomics. In some experiments with other parameter values and larger systems, I have
observed a collapse in performance, including in read-only tests, caused by continuously failing to
commit stxr (store-exclusive) instructions. In such cases, outline-atomics improved performance by
more than a factor of 2. These cases are not always easy to replicate.
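
For reference, the relative differences can be recomputed from the raw numbers with a small helper like the one below. This is only a sketch; the function name percent_change is hypothetical.

```c
/* Relative change of 'patched' versus 'baseline', in percent.
 * Positive means the patched build scored higher. */
double percent_change(double baseline, double patched)
{
    return (patched - baseline) / baseline * 100.0;
}
```

For example, percent_change(56275, 59681) for m6g simple-update gives roughly 6.05, while the m6g read-only pair (273327, 262702) gives roughly -3.9.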

Thank you!
and sorry again for the delay
Tsahi Zidenberg
