Re: Hot Standby 0.2.1

From: Simon Riggs <simon(at)2ndQuadrant(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot Standby 0.2.1
Date: 2009-09-22 07:22:41
Message-ID: 1253604161.4449.138.camel@ebony.2ndQuadrant
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Mon, 2009-09-21 at 19:42 -0700, Jeff Janes wrote:
> On Tue, Sep 15, 2009 at 2:41 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> >
> > OK, here is the latest version of the Hot Standby patchset. This is
> > about version 30+ by now, but we should regard this as 0.2.1
> > Patch against CVS HEAD (now): clean apply, compile, no known bugs.
> >
> > OVERVIEW
> >
> > You can download PDF versions of the fine manual is here
> > http://wiki.postgresql.org/images/0/01/Hot_Standby_main.pdf
>
>
> >From this doc:
>
> "In recovery, transactions will not be permitted to take any lock
> higher other than
> AccessShareLock or AccessExclusiveLock. In addition, transactions may never
> assign a TransactionId and may never write WAL. The LOCK TABLE command by
> default applies an AccessExclusiveLock. Any LOCK TABLE command that runs on
> the standby and requests a specific lock type other than AccessShareLock will be
> rejected."
>
> The first sentence seems to say that clients on the stand-by can take
> ACCESS EXCLUSIVE, while the last sentence seems to say that they
> cannot do so.

You are right to pick up that discrepancy, as Heikki did also.

The root cause of that discrepancy is a change in patch behaviour
between January and now that I will use your post to highlight and
discuss, if you don't mind. (and yes, the docs need to be corrected)

Initially, it seemed that it was certain that a read-only backend could
not take an AccessExclusiveLock. On further thought, there is no
particular reason to block AccessExclusiveLocks themselves, just that
most things you might do while holding one are banned. But the lock
itself is fine. (Any challenge on that?)

AccessExclusiveLocks can be used to serialize the actions of other
backends. That is a very common use case, so my concern was that LOCK
TABLE would be a no-op unless we allowed AccessExclusiveLock, so the
patch does allow it.

> I did a little experiment on a hot standby instance. I expected that
> either I would be denied the lock altogether, or the lock would cause
> WAL replay to be paused until either I committed or was forcibly
> canceled. But neither happened, I was granted the lock but WAL replay
> continued anyway.
>
> jjanes=# begin;
> BEGIN
> jjanes=# lock table pgbench_history in access exclusive mode;
> LOCK TABLE
> jjanes=# select count(*) from pgbench_history;
> count
> --------
> 519104
> (1 row)
>
> jjanes=# select count(*) from pgbench_history;
> count
> --------
> 527814
> (1 row)
>
> Is this the expected behavior?

By me, yes. WAL replay does not require a table lock to progress. Any
changes are protected with block-level locks. It does acquire a table
lock and cancel conflicting queries when it is about to replay something
that would cause a query to explode, such as dropping a table, as
explained in docs.

So this is not a bug.

The explanation of how the above sequence of events occurs is that the
backend acquires AccessExclusiveLock - please check on other session in
pg_locks. WAL replay continues by the Startup process, inserting further
rows into the pgbench_history table as a series of transactions. The
second select takes a later snapshot than the first and so sees more
data than the first select, hence a larger count. (And I am pleased to
see that recovery is progressing quickly even while your queries run).

So not a bug, but just one of many possible behaviours we could enforce.
1. Allow AccessExclusiveLocks yet they do not interrupt WAL replay
2. Allow AccessExclusiveLocks but have them pause WAL replay
3. Disallow AccessExclusiveLocks (and so LOCK TABLE is effectively a
no-op because it will not be able to serialize anything)

So the patch originally implemented (3) but now implements (1).

I would say that (2) is very undesirable because it puts WAL replay in
the control of non-superusers. That could mean LOCK TABLE implicitly
alters the high availability of the standby, and might even be used
maliciously to do that.

I'm open to views on whether we should use (1) or (3). Comments?

Implementing either is no problem and we have a straight choice. We may
even wish to review that again later from additional feedback.

(Jeff, you have also helped me understand that there is a bug in the way
serializable transactions are cancelled, which is easily corrected.
Thanks for that unexpected windfall, but it is otherwise unrelated to
your comments.)

--
Simon Riggs www.2ndQuadrant.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2009-09-22 08:04:15 Re: Hot Standby 0.2.1
Previous Message Heikki Linnakangas 2009-09-22 07:11:40 Re: Hot Standby 0.2.1