Skip site navigation (1) Skip section navigation (2)

Re: Review: Hot standby

From: "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Review: Hot standby
Date: 2008-11-26 10:22:35
Message-ID: 2e78013d0811260222p710effc6j59c0b4f10ae76fa6@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-hackers
ISTM that the redo conflict resolution is not working as intended. I did the
following test and it throws some surprises.

On standby:

postgres=# BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE ;
BEGIN
postgres=# SELECT * from test;
  a  | b
-----+---
 102 |
 103 |
(2 rows)


On primary:

postgres=# SELECT * from test;
  a  | b
-----+---
 102 |
 103 |
(2 rows)

postgres=#
postgres=# UPDATE test SET a = a + 100;
UPDATE 2
postgres=# VACUUM test;
VACUUM
postgres=# SELECT pg_switch_xlog();
 pg_switch_xlog
----------------
 0/2D000288
(1 row)


On standby (server log):

LOG:  restored log file "00000001000000000000002D" from archive
LOG:  recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT:  xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG:  recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT:  xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG:  recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT:  xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG:  recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT:  xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG:  recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT:  xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG:  recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT:  xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
LOG:  recovery cancels activity of virtual transaction 2/2 pid 10593 because
it blocks exclusive locks (current delay now 5 secs)
CONTEXT:  xlog redo exclusive relation lock: slot 99 db 11517 rel 24576
<same message repeated>


The open transaction (see above) on the standby is not still not aborted and
if I query the table in the same transaction, I get:

(Note: the transaction is still open)
postgres=#
postgres=# SELECT * from test;
 a | b
---+---
(0 rows)


I think whats happening is that ResolveRecoveryConflictWithVirtualXIDs() is
failing to abort the open transaction and it keeps trying for that,
everytime doubling the sleep time, so the LOG messages come less frequently
later, but they are never ending. Soon the sleep becomes exponentially
large.

Even though the standby has a open transaction, its obvious that the
cleanup_redo has also failed to abort the  transaction. Thats why the tuples
have disappeared from the standby (most likely because they are cleaned up
by VACUUM).


Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com

In response to

Responses

pgsql-hackers by date

Next:From: Rob KirkbrideDate: 2008-11-26 10:34:20
Subject: Re: Enhancement to pg_dump
Previous:From: Bernd HelmleDate: 2008-11-26 09:54:01
Subject: Re: WIP: Automatic view update rules

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group