| From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> | 
|---|---|
| To: | Florian Pflug <fgp(at)phlo(dot)org> | 
| Cc: | Daniel Farina <daniel(at)heroku(dot)com>, Chris Redekop <chris(at)replicon(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Hot Backup with rsync fails at pg_clog if under load | 
| Date: | 2011-10-25 09:13:14 | 
| Message-ID: | CA+U5nMJfnaay7p_rRzrvWLTyO-aKNEh414czr3jyXT8vhHiJVw@mail.gmail.com | 
| Lists: | pgsql-hackers | 
On Tue, Oct 25, 2011 at 8:03 AM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
> We are starting recovery at the right place but we are initialising
> the clog and subtrans incorrectly. Precisely, the oldestActiveXid is
> being derived later than it should be, which can cause problems if
> this then means that whole pages are uninitialised in subtrans. The bug
> only shows up if you do enough transactions (2048 is always enough) to
> move to the next subtrans page between the redo pointer and the
> checkpoint record while at the same time we do not have a long running
> transaction that spans those two points. That's just enough to happen
> reasonably frequently on busy systems and yet just enough to have
> slipped through testing.
>
> We must either:
>
> 1. Derive oldestActiveXid before we derive the redo location during
> CreateCheckpoint(), or
>
> 2. Change the way subtrans pages are initialized during recovery so we
> don't rely on oldestActiveXid.
>
> I need to think some more before I settle on a decision, but I lean
> towards doing (1) as a longer term fix and doing (2) as a short term
> fix for existing releases. I expect to have a fix later today.
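To make the "2048 is always enough" figure concrete, here is a minimal sketch of the page arithmetic, assuming the default 8 kB block size and 4-byte transaction ids (8192 / 4 = 2048 entries per subtrans page). The helper names `xid_to_subtrans_page` and `subtrans_page_skipped` are hypothetical, xid wraparound is ignored, and the model simply assumes that startup initialises subtrans pages starting from the page that holds oldestActiveXid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed defaults: 8 kB blocks and 4-byte transaction ids. */
#define BLCKSZ 8192
typedef uint32_t TransactionId;
#define SUBTRANS_XACTS_PER_PAGE (BLCKSZ / sizeof(TransactionId))   /* 2048 */

/* Subtrans page that holds the entry for xid (wraparound ignored). */
static uint32_t
xid_to_subtrans_page(TransactionId xid)
{
    return xid / (TransactionId) SUBTRANS_XACTS_PER_PAGE;
}

/*
 * Hypothetical check for this simplified model: returns true when an xid
 * that was current at the redo pointer sits on an earlier subtrans page
 * than the (too late) oldestActiveXid, so that earlier page would be
 * left uninitialised.
 */
static bool
subtrans_page_skipped(TransactionId xid_at_redo, TransactionId oldest_active_xid)
{
    /* In this model the late-derived value can never be behind the redo xid. */
    assert(xid_at_redo <= oldest_active_xid);
    return xid_to_subtrans_page(xid_at_redo) <
           xid_to_subtrans_page(oldest_active_xid);
}
```

With these assumptions, `subtrans_page_skipped(2000, 2100)` is true because the two xids straddle the boundary at 2048, while `subtrans_page_skipped(2100, 2200)` is false because both fall on page 1.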
(1) looks like the best way forward in all cases.
Patch attached. It will be backpatched to 9.0.
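For readers who want the ordering issue spelled out, the following is a toy model of option (1), not the attached patch. All names (`ToyState`, `ToyCheckpoint`, `checkpoint_old_order`, `checkpoint_new_order`, `checkpoint_is_safe`) are hypothetical, and it assumes every concurrent transaction finishes immediately, i.e. there is no long running transaction spanning the two steps.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define XACTS_PER_PAGE 2048u    /* assumed: 8192-byte page / 4-byte xid */

typedef struct { TransactionId next_xid; TransactionId oldest_running; } ToyState;
typedef struct { TransactionId xid_at_redo; TransactionId oldest_active_xid; } ToyCheckpoint;

/* Simulate n short transactions that all start and finish. */
static void
run_transactions(ToyState *s, uint32_t n)
{
    assert(s != NULL);
    s->next_xid += n;
    s->oldest_running = s->next_xid;   /* nothing older is still running */
}

/* Old order: fix the redo location first, derive oldestActiveXid later. */
static ToyCheckpoint
checkpoint_old_order(ToyState *s, uint32_t concurrent)
{
    ToyCheckpoint c;
    c.xid_at_redo = s->next_xid;
    run_transactions(s, concurrent);          /* activity between the two steps */
    c.oldest_active_xid = s->oldest_running;  /* derived too late */
    return c;
}

/* Option (1): derive oldestActiveXid first, then fix the redo location. */
static ToyCheckpoint
checkpoint_new_order(ToyState *s, uint32_t concurrent)
{
    ToyCheckpoint c;
    c.oldest_active_xid = s->oldest_running;
    run_transactions(s, concurrent);
    c.xid_at_redo = s->next_xid;
    return c;
}

/* Safe in this model if initialisation starts no later than the redo xid's page. */
static bool
checkpoint_is_safe(ToyCheckpoint c)
{
    return c.oldest_active_xid / XACTS_PER_PAGE <= c.xid_at_redo / XACTS_PER_PAGE;
}
```

Starting from xid 100 with 2048 concurrent transactions, the old order records a redo xid on page 0 and an oldestActiveXid on page 1, so `checkpoint_is_safe` returns false. The new order can only make oldestActiveXid older than necessary, which is harmless in this model.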
I think it is possible to avoid taking XidGenLock during
GetRunningTransactions() now, but I haven't included that change in
this patch.
Any other comments before commit?
-- 
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
| Attachment | Content-Type | Size | 
|---|---|---|
| oldestActiveXid_fixed.v1.patch | application/octet-stream | 4.9 KB | 