Re: Review: Hot standby

From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Review: Hot standby
Date: 2008-11-26 10:22:35
Message-ID: 2e78013d0811260222p710effc6j59c0b4f10ae76fa6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

ISTM that the redo conflict resolution is not working as intended. I did the
following test and it throws some surprises.

On standby:

postgres=# BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE ;
BEGIN
postgres=# SELECT * from test;
a | b
-----+---
102 |
103 |
(2 rows)

On primary:

postgres=# SELECT * from test;
a | b
-----+---
102 |
103 |
(2 rows)

postgres=#
postgres=# UPDATE test SET a = a + 100;
UPDATE 2
postgres=# VACUUM test;
VACUUM
postgres=# SELECT pg_switch_xlog();
pg_switch_xlog
----------------
0/2D000288
(1 row)

On standby (server log):

LOG: restored log file "00000001000000000000002D" from archive
LOG: recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT: xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG: recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT: xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG: recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT: xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG: recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT: xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG: recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT: xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG: recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT: xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG: recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT: xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
<same message repeated>

The open transaction (see above) on the standby is not still not aborted and
if I query the table in the same transaction, I get:

(Note: the transaction is still open)
postgres=#
postgres=# SELECT * from test;
a | b
---+---
(0 rows)

I think whats happening is that ResolveRecoveryConflictWithVirtualXIDs() is
failing to abort the open transaction and it keeps trying for that,
everytime doubling the sleep time, so the LOG messages come less frequently
later, but they are never ending. Soon the sleep becomes exponentially
large.

Even though the standby has a open transaction, its obvious that the
cleanup_redo has also failed to abort the transaction. Thats why the tuples
have disappeared from the standby (most likely because they are cleaned up
by VACUUM).

Thanks,
Pavan

--
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Rob Kirkbride 2008-11-26 10:34:20 Re: Enhancement to pg_dump
Previous Message Bernd Helmle 2008-11-26 09:54:01 Re: WIP: Automatic view update rules