Hot Standby performance and deadlocking

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Hot Standby performance and deadlocking
Date: 2010-05-25 10:12:55
Message-ID: 1274782375.6203.1461.camel@ebony
Lists: pgsql-hackers


Some performance problems have been reported on HS from two users: Erik
and Stefan.

Both reports share two characteristics:
* performance is sporadically reduced, though the standby mostly runs
  at full speed
* context switch storms are reported alongside the slowdowns

So we're looking for something that doesn't always happen, but when it
does it involves lots of processes and context switching.

Unfortunately neither reporter has been able to re-run their tests,
which leaves me little to go on. Since I know the code well, though, I
can focus on likely suspects fairly easily, and in this case I think I
have a root cause.

Earlier this year I added deadlock detection to the Startup process for
when it waits for a buffer pin. The deadlock detection was simplified:
it doesn't wait for deadlock_timeout before acting, it just immediately
sends a signal to all active processes to resolve the deadlock, even if
the buffer pin is released very soon afterwards. Heikki questioned this
implementation at the time, though I said it was easier to start simple
and add more code if problems arose and time allowed. It's clear that
with 100+ connections and reasonably frequent buffer pin waits, as would
occur when accessing the same data blocks on both primary and standby,
the current too-simple coding would cause performance issues, as Heikki
implied. Actual deadlocks are certainly much rarer than buffer pin
waits, so the current coding is wasteful.

The following patch adds some simple logic to make the Startup process
wait for deadlock_timeout before it sends the deadlock resolution
signals. It does that by refactoring the API to
enable_standby_sigalrm(), though it doesn't change other behaviour or
add new features.
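
To make the difference concrete, here is a minimal sketch in C that
counts how many times the Startup process would signal all backends
under each policy. It is a toy model, not code from the patch: the
names WaitPolicy, SIGNAL_IMMEDIATELY, SIGNAL_AFTER_TIMEOUT and
count_broadcasts() are hypothetical, and it assumes a signal is sent
only when a buffer pin wait lasts at least deadlock_timeout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical policy names for illustration; not identifiers from the patch. */
typedef enum
{
    SIGNAL_IMMEDIATELY,     /* current coding: signal as soon as a wait starts */
    SIGNAL_AFTER_TIMEOUT    /* proposed: signal only once the timeout has elapsed */
} WaitPolicy;

/*
 * Count the deadlock-resolution broadcasts sent for a series of buffer pin
 * waits, where wait_ms[i] is how long the i-th wait lasts before the pin is
 * released. Simplifying assumption: under SIGNAL_AFTER_TIMEOUT a broadcast
 * happens only if the wait lasts at least deadlock_timeout_ms.
 */
size_t
count_broadcasts(const int *wait_ms, size_t nwaits,
                 int deadlock_timeout_ms, WaitPolicy policy)
{
    size_t broadcasts = 0;

    assert(deadlock_timeout_ms >= 0);

    for (size_t i = 0; i < nwaits; i++)
    {
        if (policy == SIGNAL_IMMEDIATELY)
            broadcasts++;   /* every wait wakes all active backends */
        else if (wait_ms[i] >= deadlock_timeout_ms)
            broadcasts++;   /* only waits that outlast the timeout do */
    }

    return broadcasts;
}
```

For five waits lasting 2, 5, 1, 3000 and 4 ms with deadlock_timeout at
its default of 1000 ms, SIGNAL_IMMEDIATELY gives 5 broadcasts and
SIGNAL_AFTER_TIMEOUT gives 1. With 100+ connections each broadcast wakes
every active backend, which would fit the context switch storms
reported.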

Viewpoints?

--
Simon Riggs www.2ndQuadrant.com

Attachment: hs_deadlock_timeout.patch (text/x-patch, 8.8 KB)
