Re: [GENERAL] 8.1.4 - problem with PITR - .backup.done /

From: Rafael Martinez <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [GENERAL] 8.1.4 - problem with PITR - .backup.done /
Date: 2006-05-30 18:21:32
Message-ID: 1149013293.980.24.camel@linux.site
Lists: pgsql-general, pgsql-hackers

On Tue, 2006-05-30 at 09:45 -0400, Tom Lane wrote:
> "Rafael Martinez, Guerrero" <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no> writes:
> > The problem was that 000000010000000800000010.0006D5E8.backup was
> > already archived, but under pg_xlog/archive_status/ there were two
> > files:
> > -------------------------------------------------
> > 000000010000000800000010.0006D5E8.backup.done
> > 000000010000000800000010.0006D5E8.backup.ready
> > -------------------------------------------------
> 
> > This situation should not happen, anyone has seen this problem before?
> 
> No, it shouldn't.  What I suspect is that XLogArchiveIsDone() got
> confused and created a duplicate .ready file.  It basically assumes
> that the only way its stat() calls can fail is ENOENT, ie, file not
> there ... but I wonder if they failed for some other reason instead.
> What sort of platform and filesystem is this on?
> 
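
To make the stat() concern above concrete, here is a minimal sketch (not the PostgreSQL code) of a probe that treats only ENOENT as "file not there" and reports any other failure as unknown. The names probe_status_file and status_probe are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical three-way result: the file is there, is not there, or we cannot tell. */
typedef enum
{
    STATUS_PRESENT,
    STATUS_ABSENT,
    STATUS_UNKNOWN
} status_probe;

/*
 * Probe a status file with stat(). Only ENOENT is taken to mean "absent";
 * any other errno (EACCES, ENOTDIR, EIO, ...) is reported as unknown so the
 * caller does not mistake a failed lookup for a missing file.
 */
static status_probe
probe_status_file(const char *path)
{
    struct stat stat_buf;
    int         saved_errno;

    assert(path != NULL);

    if (stat(path, &stat_buf) == 0)
        return STATUS_PRESENT;

    saved_errno = errno;
    if (saved_errno == ENOENT)
        return STATUS_ABSENT;

    fprintf(stderr, "stat(\"%s\") failed: %s\n", path, strerror(saved_errno));
    return STATUS_UNKNOWN;
}
```

A caller that gets STATUS_UNKNOWN could log the error and retry later instead of recreating a .ready file; whether that is the right policy for the archiver is an assumption of this sketch.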

This is on an AMD64 Opteron server running RHELAS4 / 2.6.9-34.ELsmp with an
ext3 filesystem. This is the first time this has happened.

I do not know the postgres internals, but after a quick look at the source
code ......

XLogArchiveIsDone() has this code at the end of the function:
-------------------------------------------------
        /* Race condition --- maybe archiver just finished, so recheck */
        StatusFilePath(archiveStatusPath, xlog, ".done");
        if (stat(archiveStatusPath, &stat_buf) == 0)
                return true;

        /* Retry creation of the .ready file */
        XLogArchiveNotify(xlog);
        return false;
}
-------------------------------------------------

What happens if we have a race condition and the archiver creates
a .done file between the last check for the .done file and the creation
of the .ready file by XLogArchiveNotify?
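
The window I am asking about can be replayed in a single process. This is only a sketch under assumptions, not the PostgreSQL code: status_path, create_empty, recheck_then_notify and archiver_finishes are hypothetical names, creating an empty file stands in for XLogArchiveNotify(), and the hook assumes that the archiver is able to create the .done file at exactly that point, which I have not verified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical stand-in for StatusFilePath(): builds "<dir>/<xlog><suffix>". */
static void
status_path(char *buf, size_t len, const char *dir, const char *xlog,
            const char *suffix)
{
    int n = snprintf(buf, len, "%s/%s%s", dir, xlog, suffix);

    assert(n > 0 && (size_t) n < len);
    (void) n;
}

static bool
file_exists(const char *path)
{
    struct stat stat_buf;

    return stat(path, &stat_buf) == 0;
}

static void
create_empty(const char *path)
{
    FILE *fp = fopen(path, "w");

    if (fp != NULL)
        fclose(fp);
}

/* Hook that runs between the .done check and the .ready creation. */
typedef void (*window_hook) (const char *done_path);

/* Mirrors the shape of the quoted tail: recheck .done, then create .ready. */
static bool
recheck_then_notify(const char *dir, const char *xlog, window_hook in_window)
{
    char done[1024];
    char ready[1024];

    status_path(done, sizeof done, dir, xlog, ".done");
    status_path(ready, sizeof ready, dir, xlog, ".ready");

    if (file_exists(done))
        return true;

    /* Assumed interleaving: another process acts inside the window. */
    if (in_window != NULL)
        in_window(done);

    create_empty(ready);        /* stands in for XLogArchiveNotify() */
    return false;
}

/* Simulated archiver: finishes and leaves a .done file behind. */
static void
archiver_finishes(const char *done_path)
{
    create_empty(done_path);
}

static bool
both_status_files_exist(const char *dir, const char *xlog)
{
    char done[1024];
    char ready[1024];

    status_path(done, sizeof done, dir, xlog, ".done");
    status_path(ready, sizeof ready, dir, xlog, ".ready");
    return file_exists(done) && file_exists(ready);
}
```

Calling recheck_then_notify(".", "SEG", archiver_finishes) in an empty directory leaves both SEG.done and SEG.ready behind, which is the state I found under archive_status/. With a NULL hook only SEG.ready is created.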

> Did you happen to make note of the mod times of the two files before
> deleting them?
> 

No, I did not :( If it happens again, I will.

regards,
-- 
Rafael Martinez, <r(dot)m(dot)guerrero(at)usit(dot)uio(dot)no>
Center for Information Technology Services
University of Oslo, Norway

PGP Public Key: http://folk.uio.no/rafael/