Re: max_standby_delay considered harmful

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Florian Pflug <fgp(at)phlo(dot)org>, Dimitri Fontaine <dfontaine(at)hi-media(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org, Bruce Momjian <bruce(at)momjian(dot)us>, Greg Smith <greg(at)2ndquadrant(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: max_standby_delay considered harmful
Date: 2010-05-10 06:27:44
Message-ID: 1273472865.3936.1954.camel@ebony
Lists: pgsql-hackers

On Sun, 2010-05-09 at 20:56 -0400, Robert Haas wrote:

> >> > Seems like it could take FOREVER on a busy system. Surely that's not
> >> > OK. The fact that Hot Standby has to take exclusive locks that can't
> >> > be released until WAL replay has progressed to a certain point seems
> >> > like a fairly serious wart.
> >>
> >> If this is a serious wart then it's not one of hot standby, but one of
> >> postgres proper. AccessExclusiveLocks (SELECT-blocking locks that is, as
> >> opposed to UPDATE/DELETE-blocking locks) are never necessary from a
> >> correctness POV, they're only there for implementation reasons.
> >>
> >> Getting rid of them doesn't seem completely insurmountable either - just as
> >> multiple row versions remove the need to block SELECTs due to concurrent
> >> UPDATEs, multiple datafile versions could remove the need to block SELECTs
> >> due to concurrent ALTERs. But people seem to live with them quite well,
> >> judged from the amount of work put into getting rid of them (zero). I
> >> therefore fail to see why they should pose a significant problem in HS
> >> setups.
> > The difference is that in HS you have to wait for a moment where *no exclusive
> > lock at all* exists, possibly without contending for any of them, while on the
> > master you might not even be blocked by the existence of any of those locks.
> >
> > If you have two sessions which in overlapping transactions lock different
> > tables exclusively you have no problem shutting the master down, but you will
> > never reach a point where no exclusive lock is taken on the slave.
>
> A possible solution to this in the shutdown case is to kill anyone
> waiting on a lock held by the startup process at the same time we kill
> the startup process, and to kill anyone who subsequently waits for
> such a lock as soon as they attempt to take it.

I already explained that killing the startup process first is a bad idea
for many reasons when shutdown was discussed. Can't remember who added
the new standby shutdown code recently, but it sounds like their design
was pretty poor if it didn't include shutting down properly with HS. I
hope they fix the bug they have introduced. HS was never designed to
work that way, so there is no flaw there; it certainly worked when
committed.

> I'm not sure if this
> would also make sense in the pause case.

Not sure why pausing replay would make any difference at all. Being
between one WAL record and the next is a valid and normal state that
exists many thousands of times per second. If making that state longer
caused problems, we would already have seen them. There are none; it
will work fine.
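
To illustrate the point about record boundaries, here is a toy sketch (not
PostgreSQL's startup-process code; the ToyStandby class and its fields are
hypothetical names invented for this example). It assumes replay is a loop
that applies one record at a time, so a pause only ever takes effect between
two records, and resuming simply continues from the same boundary.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStandby:
    """Hypothetical stand-in for a standby replaying WAL; not PostgreSQL code."""
    applied: list = field(default_factory=list)
    paused: bool = False

    def replay(self, records):
        """Apply records one at a time, stopping at a record boundary if paused.

        Returns the number of records applied during this call.
        """
        count = 0
        for rec in records:
            if self.paused:
                break  # we are between two records: same state as normal running
            self.applied.append(rec)
            count += 1
        return count


standby = ToyStandby()
wal = ["rec1", "rec2", "rec3", "rec4"]

standby.replay(wal[:2])       # normal running: two records applied
standby.paused = True
held = standby.replay(wal[2:])  # paused: nothing applied, state unchanged
standby.paused = False
standby.replay(wal[2:])       # resume from the same boundary
```

In this model the paused state is indistinguishable from the instant between
two records during normal replay; it just lasts longer.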

> Another possible solution would be to try to figure out if there's a
> way to delay application of WAL that requires the taking of AELs to
> the point where we could apply it all at once. That might not be
> feasible, though, or only in some cases, and it's certainly 9.1
> material (at least) in any case.

Locks usually protect users from accessing a table while it's being
clustered or dropped or something like that. Locks are not bad. They are
also used by some developers to specifically serialize access to an
object. AccessExclusiveLocks are rare in normal running and not to be
avoided when they do exist. HS correctly supports locking, as and when
such locks are made on the master.
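
For reference, the overlapping-transactions scenario quoted earlier can be
made concrete with a small interval model. This is only a sketch under
assumptions: each lock is reduced to an (acquire, release) pair in arbitrary
time units, and lock_free_gaps is a hypothetical helper written for this
example, not anything in PostgreSQL.

```python
def lock_free_gaps(intervals):
    """Return the (start, end) gaps during which no lock is held.

    intervals: list of (acquire_time, release_time) pairs, one per lock.
    Only gaps between the first acquire and the last release are reported.
    """
    gaps = []
    covered_until = None
    for start, end in sorted(intervals):
        if covered_until is not None and start > covered_until:
            # Every earlier lock was released before this one was acquired.
            gaps.append((covered_until, start))
        covered_until = end if covered_until is None else max(covered_until, end)
    return gaps


# Locks on different tables held in overlapping transactions.
overlapping = [(0, 10), (5, 15), (12, 20)]
# Locks that happen to be separated in time.
spaced = [(0, 4), (6, 9)]

no_gaps = lock_free_gaps(overlapping)
some_gaps = lock_free_gaps(spaced)
```

With the overlapping set there is no instant at which zero locks are held,
even though no lock conflicts with any other; with the spaced set such an
instant exists between the two locks.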

--
Simon Riggs www.2ndQuadrant.com
