Re: Possible performance regression in version 10.1 with pgbench read-write tests.

From: Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>
To: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Possible performance regression in version 10.1 with pgbench read-write tests.
Date: 2018-07-20 19:23:28
Message-ID: CAD__OuiWigmaYRec3A4H3EuyNp0nJqqPF_+_BGiWtDs32mY64Q@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jul 20, 2018 at 10:52 AM, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com> wrote:

> On Fri, Jul 20, 2018 at 7:56 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > It's not *that* noticeable, as I failed to demonstrate any performance
> > difference before committing the patch. I think some more investigation
> > is warranted to find out why some other people are getting different
> > results
> Maybe false sharing is a factor, since sizeof(sem_t) is 32 bytes on
> Linux/amd64 and we're probably hitting elements clustered at one end
> of the array? Let's see... I tried sticking padding into
> PGSemaphoreData and I got ~8% more TPS (72 client on multi socket
> box, pgbench scale 100, only running for a minute but otherwise the
> same settings that Mithun showed).
>
> --- a/src/backend/port/posix_sema.c
> +++ b/src/backend/port/posix_sema.c
> @@ -45,6 +45,7 @@
> typedef struct PGSemaphoreData
> {
> sem_t pgsem;
> + char padding[PG_CACHE_LINE_SIZE - sizeof(sem_t)];
> } PGSemaphoreData;
>
> That's probably not the right idiom and my tests probably weren't long
> enough, but there seems to be some effect here.
>

I ran a quick test with the patch applied, using the same settings as in my
initial report (on the latest PostgreSQL 10 code), with 72 clients.

CASE 1:
Without patch: TPS 29269.823540

With patch: TPS 36005.544960 (a 23% jump)

Just disabling unnamed POSIX semaphores: TPS 34481.207959

So that does seem to be the issue, as the test is running on an 8-node NUMA
machine.
I also came across a presentation [1]; slide 20 says that one of the futex
architectures is bad for NUMA machines. I am not sure whether the newer fix
for that is included in Linux version 3.10.0-693.5.2.el7.x86_64, which is
what my test machine runs.
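
For reference, the padding idea from the quoted patch can be written
standalone roughly as below. This is only a sketch under assumptions:
CACHE_LINE_SIZE is taken to be 64 bytes, and the names PaddedSem and
padded_sem_line are made up for illustration (the real tree uses
PG_CACHE_LINE_SIZE and PGSemaphoreData). It uses a union instead of the
subtraction in the diff, so the pad size never has to be computed by hand.

```c
#include <assert.h>
#include <semaphore.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines. PostgreSQL uses PG_CACHE_LINE_SIZE. */
#define CACHE_LINE_SIZE 64

/*
 * Hypothetical stand-in for PGSemaphoreData. The char array rounds each
 * slot up to at least one full cache line, and alignas makes every slot
 * start on a line boundary, so two slots never share a line.
 */
typedef union PaddedSem
{
    alignas(CACHE_LINE_SIZE) sem_t sem;
    char        pad[CACHE_LINE_SIZE];
} PaddedSem;

/* Fail at compile time if a slot is not a whole number of cache lines. */
static_assert(sizeof(PaddedSem) % CACHE_LINE_SIZE == 0,
              "PaddedSem must occupy whole cache lines");

/* Cache-line index, relative to the start of the array, of slot i. */
static inline size_t
padded_sem_line(const PaddedSem *base, size_t i)
{
    return (size_t) ((const char *) &base[i] - (const char *) base)
        / CACHE_LINE_SIZE;
}
```

With this layout, padded_sem_line(arr, 0) and padded_sem_line(arr, 1)
differ for any array arr of PaddedSem, whereas two neighbouring bare
32-byte sem_t elements can sit in the same 64-byte line.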

[1] https://www.slideshare.net/davidlohr/futex-scaling-for-multicore-systems

--
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com
