Re: Possible performance regression in version 10.1 with pgbench read-write tests.

From: Andres Freund <andres(at)anarazel(dot)de>
To: Mithun Cy <mithun(dot)cy(at)enterprisedb(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Possible performance regression in version 10.1 with pgbench read-write tests.
Date: 2018-07-20 19:29:49
Message-ID: 20180720192949.wjgmwvqbthnphnna@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2018-07-21 00:53:28 +0530, Mithun Cy wrote:
> On Fri, Jul 20, 2018 at 10:52 AM, Thomas Munro <
> thomas(dot)munro(at)enterprisedb(dot)com> wrote:
>
> > On Fri, Jul 20, 2018 at 7:56 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > >
> > > It's not *that* noticeable, as I failed to demonstrate any performance
> > > difference before committing the patch. I think some more investigation
> > > is warranted to find out why some other people are getting different
> > > results
> > Maybe false sharing is a factor, since sizeof(sem_t) is 32 bytes on
> > Linux/amd64 and we're probably hitting elements clustered at one end
> > of the array? Let's see... I tried sticking padding into
> > PGSemaphoreData and I got ~8% more TPS (72 client on multi socket
> > box, pgbench scale 100, only running for a minute but otherwise the
> > same settings that Mithun showed).
> >
> > --- a/src/backend/port/posix_sema.c
> > +++ b/src/backend/port/posix_sema.c
> > @@ -45,6 +45,7 @@
> >  typedef struct PGSemaphoreData
> >  {
> >      sem_t pgsem;
> > +    char padding[PG_CACHE_LINE_SIZE - sizeof(sem_t)];
> >  } PGSemaphoreData;
> >
> > That's probably not the right idiom and my tests probably weren't long
> > enough, but there seems to be some effect here.
> >
>
> I did a quick test applying the patch, with the same settings as in my
> initial report (on the latest PostgreSQL 10 code):
> 72 clients
>
> CASE 1:
> Without Patch : TPS 29269.823540
>
> With Patch : TPS 36005.544960. --- 23% jump
>
> Just Disabling using unnamed POSIX semaphores: TPS 34481.207959

> So it seems that is the issue, as the test is being run on an 8-node NUMA
> machine.

Cool. I think we should just backpatch that then. Does anybody want to
argue against?
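
For readers following the thread, the padding idea in the quoted patch can be sketched as a standalone layout check. This is only an illustration under stated assumptions, not the change proposed for backpatching: `CACHE_LINE_SIZE`, `PaddedSem`, `padded_sem_stride` and `same_cache_line` are hypothetical names, 128 is an assumed line size, and a union is used instead of the `PG_CACHE_LINE_SIZE - sizeof(sem_t)` subtraction that Thomas himself called "probably not the right idiom".

```c
#include <assert.h>
#include <semaphore.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size for this sketch (hypothetical constant). */
#define CACHE_LINE_SIZE 128

/*
 * Pad each semaphore out to a full cache line. A union avoids computing
 * CACHE_LINE_SIZE - sizeof(sem_t) by hand; it assumes that sem_t is no
 * larger than one cache line, which the static assertion below checks.
 */
typedef union PaddedSem
{
    sem_t sem;
    char  pad[CACHE_LINE_SIZE];
} PaddedSem;

_Static_assert(sizeof(PaddedSem) == CACHE_LINE_SIZE,
               "sem_t does not fit in one assumed cache line");

/* Distance in bytes between consecutive semaphores in a PaddedSem array. */
static size_t
padded_sem_stride(void)
{
    return sizeof(PaddedSem);
}

/*
 * Report whether two addresses fall into the same CACHE_LINE_SIZE-sized
 * block. Two backends sleeping on semaphores for which this returns 1
 * can suffer from false sharing.
 */
static int
same_cache_line(const void *a, const void *b)
{
    assert(a != NULL && b != NULL);
    return ((uintptr_t) a / CACHE_LINE_SIZE) ==
           ((uintptr_t) b / CACHE_LINE_SIZE);
}
```

With a 32-byte `sem_t`, as reported above for Linux/amd64, several unpadded array elements can land in one line. With the padded type, consecutive elements are always `CACHE_LINE_SIZE` bytes apart, so `same_cache_line` returns 0 for any two distinct elements.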

> I also came across a presentation [1]: slide 20 says that one of those
> futex architectures is bad for NUMA machines. I am not sure whether the
> newer fix for this is included in Linux version 3.10.0-693.5.2.el7.x86_64,
> which is on my test machine.

Similar issues are also present internally for sysv semas, so I don't
think this really means that much.

Greetings,

Andres Freund
