Re: ERROR: could not open relation base/2757655/6930168: No such file or directory -- during warm standby setup

From: bricklen <bricklen(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: ERROR: could not open relation base/2757655/6930168: No such file or directory -- during warm standby setup
Date: 2010-12-29 19:53:00
Message-ID: AANLkTikeyUHaWW6Tc5_CWvxwSW5efQNyTH8P-XjmJLy8@mail.gmail.com
Lists: pgsql-general

On Wed, Dec 29, 2010 at 11:35 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> bricklen <bricklen(at)gmail(dot)com> writes:
>> After setting up a warm standby
>> (pg_start_backup/rsync/pg_stop_backup), and promoting to master, we
>> encountered an error in the middle of an analyze of the new standby
>> db. (the standby server is a fresh server)
>> [ relfilenode doesn't match on source and standby ]
>
> What can you tell us about what was happening on the source DB while
> the backup was being taken?  In particular I'm wondering if anything
> that would've given offer2offer a new relfilenode was in progress.
> Also, does the pg_class entry for offer2offer have the same xmin and
> ctid in both DBs?
>
>                        regards, tom lane

A couple other notes:
- There are two tables that are affected, not just one. I ran
individual ANALYZE commands on every table in the db and found that.
- The rsync command that we are using passes -e "ssh -p 9001" so that
we can specify a port number:

rsync -av -e "ssh -p 9001" --progress --partial -z /var/lib/pgsql/data \
    postgres@standby-tunnel:/var/lib/pgsql/

The pg_start_backup/pg_stop_backup range was about 10 hours, as the
transfer took that long (480GB transfer).

Sorry for my ignorance, I don't

The source db has between 1000 and 3000 transactions/s, so is
reasonably volatile. The two tables in question are not accessed very
heavily though.

Looking at the ctid and xmin between both databases, no, they don't
seem to match exactly. Pardon my ignorance, but would those have
changed due to vacuums, analyze, or any other forms of access?

Source offer2offer:
select ctid,xmin,* from pg_class where relname='offer2offer';
-[ RECORD 1 ]--+--------------------------------------------------------------------
ctid | (142,2)
xmin | 1228781192
relname | offer2offer
relnamespace | 2200
reltype | 2760224
relowner | 10
relam | 0
relfilenode | 6946955
reltablespace | 0
relpages | 5216
reltuples | 324642
reltoastrelid | 2760225
reltoastidxid | 0
relhasindex | f
relisshared | f
relistemp | f
relkind | r
relnatts | 12
relchecks | 0
relhasoids | f
relhaspkey | f
relhasrules | f
relhastriggers | f
relhassubclass | f
relfrozenxid | 1228781185

Standby offer2offer:
select ctid,xmin,* from pg_class where relname='offer2offer';
-[ RECORD 1 ]--+---------------------------------------------------------------------
ctid | (142,1)
xmin | 1227738244
relname | offer2offer
relnamespace | 2200
reltype | 2760224
relowner | 10
relam | 0
relfilenode | 6930168
reltablespace | 0
relpages | 5210
reltuples | 324102
reltoastrelid | 2760225
reltoastidxid | 0
relhasindex | f
relisshared | f
relistemp | f
relkind | r
relnatts | 12
relchecks | 0
relhasoids | f
relhaspkey | f
relhasrules | f
relhastriggers | f
relhassubclass | f
relfrozenxid | 1227738213
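
To make the differences between the two records easier to spot, here is a rough sketch in Python (an illustration only, not something that was run as part of this setup) that parses psql expanded output and lists the columns whose values differ. The function names parse_expanded and diff_records are hypothetical, and the sample text is trimmed to a subset of the columns shown above.

```python
def parse_expanded(text):
    """Parse one record of psql expanded-display output into a dict."""
    record = {}
    for line in text.splitlines():
        # Skip the "-[ RECORD n ]" banner and any line without a separator.
        if line.startswith("-[") or "|" not in line:
            continue
        name, _, value = line.partition("|")
        record[name.strip()] = value.strip()
    return record


def diff_records(source, standby):
    """Return {column: (source_value, standby_value)} for differing columns."""
    return {
        col: (source.get(col), standby.get(col))
        for col in source.keys() | standby.keys()
        if source.get(col) != standby.get(col)
    }


# Values copied from the two pg_class records above (subset of columns).
SOURCE = """\
-[ RECORD 1 ]--+-----
ctid | (142,2)
xmin | 1228781192
relname | offer2offer
relfilenode | 6946955
relpages | 5216
reltuples | 324642
reltoastrelid | 2760225
relfrozenxid | 1228781185
"""

STANDBY = """\
-[ RECORD 1 ]--+-----
ctid | (142,1)
xmin | 1227738244
relname | offer2offer
relfilenode | 6930168
relpages | 5210
reltuples | 324102
reltoastrelid | 2760225
relfrozenxid | 1227738213
"""

if __name__ == "__main__":
    differences = diff_records(parse_expanded(SOURCE), parse_expanded(STANDBY))
    for column in sorted(differences):
        src, stb = differences[column]
        print(f"{column}: source={src} standby={stb}")
```

With the values above, the differing columns are ctid, xmin, relfilenode, relpages, reltuples and relfrozenxid; relname and reltoastrelid match on both sides.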
