Re: ERROR: could not open relation base/2757655/6930168: No such file or directory -- during warm standby setup

From: bricklen <bricklen(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: ERROR: could not open relation base/2757655/6930168: No such file or directory -- during warm standby setup
Date: 2010-12-31 19:13:23
Message-ID: AANLkTinvZHHzX_auxnCcVDpEz9yi7pCRfUMQVsc6z+xL@mail.gmail.com
Lists: pgsql-general

On Wed, Dec 29, 2010 at 1:53 PM, bricklen <bricklen(at)gmail(dot)com> wrote:
> On Wed, Dec 29, 2010 at 12:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> The difference in ctid, and the values of xmin and relfrozenxid,
>> seems to confirm my suspicion that this wasn't just random cosmic rays.
>> You did something on the source DB that rewrote the table with a new
>> relfilenode (possibly CLUSTER or some form of ALTER TABLE; plain VACUUM
>> or ANALYZE wouldn't do it).  And for some reason the standby hasn't
>> picked up that change in the pg_class row.  I suspect the explanation
>> is that your technique for setting up the standby is flawed.  You can't
>> just rsync and have a valid snapshot of the DB --- you need to be sure
>> that enough WAL gets replayed to fix any inconsistencies arising from
>> the time-extended nature of the rsync operation.  But you didn't say
>> exactly how you did that.
>>
>
> Definitely no CLUSTER commands were issued, and there should have been
> no ALTER commands issued (yesterday was a holiday, no one was here).
> Would a TRUNCATE have the same effect though? I grep'd through our
> application, and it appears that at least 3 tables get truncated, one
> of them several times per hour. The often-truncated table wasn't one
> of the bad ones, but the others are the ones I've already identified
> as non-existent.

Update: We set up the warm standby again and hit the same issue on
two of the three previously identified tables, the ones that can get
truncated throughout the day. We're going to try again overnight, when
those tables are not truncated, and see if that gives us a correctly
working standby.

From what I could find in posts to these lists, TRUNCATE does assign
the table a new relfilenode, and that could account for the issue we
are experiencing. What I find odd is that we have one other table (an
aggregate table) that is truncated every 15 minutes, yet that one was
fine in both attempts at the warm standby.
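
For anyone comparing the primary and the standby, here is a rough
sketch of how the path in the error message could be turned into
catalog lookups. It assumes the usual base/<database OID>/<relfilenode>
layout; the function names are made up for illustration and are not
part of any PostgreSQL tool. The pg_class query has to be run while
connected to the database the first query names.

```python
import re

# Matches the path portion of errors such as
# "could not open relation base/2757655/6930168: No such file or directory".
# Assumption: the path has the form base/<database OID>/<relfilenode>.
_REL_PATH = re.compile(r"base/(\d+)/(\d+)")


def parse_relation_path(message):
    """Return (database_oid, relfilenode) from the error text, or None."""
    m = _REL_PATH.search(message)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def lookup_queries(message):
    """Build catalog queries to run on both the primary and the standby.

    Hypothetical helper: it only formats SQL text, it does not connect
    to any server.
    """
    parsed = parse_relation_path(message)
    if parsed is None:
        raise ValueError("no base/<db oid>/<relfilenode> path in message")
    db_oid, relfilenode = parsed
    return [
        "SELECT datname FROM pg_database WHERE oid = %d;" % db_oid,
        "SELECT relname, relfilenode FROM pg_class "
        "WHERE relfilenode = %d;" % relfilenode,
    ]


if __name__ == "__main__":
    err = ("ERROR: could not open relation base/2757655/6930168: "
           "No such file or directory")
    for query in lookup_queries(err):
        print(query)
```

If the standby's pg_class row still carries the old relfilenode while
the primary shows a different one for the same relname, that would be
consistent with the table having been rewritten (for example by
TRUNCATE) while the base copy was in progress.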
