Re: Sync Rep and shutdown Re: Sync Rep v19

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, Yeb Havinga <yebhavinga(at)gmail(dot)com>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sync Rep and shutdown Re: Sync Rep v19
Date: 2011-03-18 22:42:10
Message-ID: AANLkTimvL6q8U3DYv9VS7jg2R-LQbYs+JYCEPTbuDxTT@mail.gmail.com
Lists: pgsql-hackers

Responding to this again, somewhat out of order...

On Fri, Mar 18, 2011 at 1:15 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> Together that's about a >20% hit in performance in Yeb's tests. I think
> you should spend a little time thinking how to retune that.

I've spent some time playing around with pgbench, and so far I haven't
been able to reliably reproduce this. That's not to say I don't
believe the effect is real. Either I'm doing something completely
wrong, or measuring it requires some specific setup that doesn't match
my environment, or it's somewhat finicky to reproduce, or some
combination of the above.

> You've added a test inside the lock to see if there is a standby, which
> I took out for performance reasons. Maybe there's another way, I know
> that code is fiddly.

It seems pretty easy to remove the branch from the test at the top of
the function by just rearranging things a bit. Patch attached; does
this help?

> You've also added back in the lock acquisition at wakeup with very
> little justification, which was a major performance hit.

I have a very difficult time believing this is a real problem. That
extra lock acquisition and release only happens if WaitLatchOrSocket()
returns but MyProc->syncRepState still appears to be SYNC_REP_WAITING.
That should only happen if the latch wait hits the timeout (which
takes 60 s!) or if the precise memory ordering problem that the lock
acquisition was put in to fix is occurring (in which case it should
dramatically *improve* performance, by avoiding an extra 60 s wait).
I stuck in a call to
elog(LOG, "got here") and it didn't fire even once in a 5-minute
pgbench test (~45k transactions). So I have a hard time crediting
this for any performance problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment: sync-standbys-defined-rearrangement.patch (application/octet-stream, 1.8 KB)
