Re: Recovery inconsistencies, standby much larger than primary

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Recovery inconsistencies, standby much larger than primary
Date: 2014-02-12 18:51:47
Message-ID: 32313.1392231107@sss.pgh.pa.us
Lists: pgsql-hackers

I wrote:
> Greg Stark <stark(at)mit(dot)edu> writes:
>> (Or maybe the hot backup
>> process could just catch the files in this state if a table is rapidly
>> growing and it doesn't take care to avoid picking up new files that
>> appear after it starts?)

> That's a possible explanation I guess, but it doesn't seem terribly
> probable from a timing standpoint.

I did a bit of arithmetic using the cases you posted previously.
In the first case, where block 3634978 got written to 7141472,
you can make the numbers come out right if you assume that a page
was missing at the end of segment 1 --- that leads to the conclusion
that EOF exclusive of that missing page had been around 28.75 GB,
which squares well with the relation's size on master. However, it's
fairly hard to credit that the base backup would have collected
full-size or nearly full-size images of segments 2 through 28 while
not seeing segment 1 at full size. You'd have to assume that the
rel grew by a factor of ~14 while the base backup was in progress
--- and then didn't grow very much more afterwards. (What state
exactly did you measure the primary rel sizes in? Was it long
after the backup/restore, or did you rewind things somehow?)
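
The arithmetic above can be reconstructed roughly as follows. This is
only a sketch under stated assumptions: the default 8 kB block size and
1 GB segments (131072 blocks each), and a simplified model in which
replay issues one P_NEW extension per block from the apparent EOF up to
the target block, the first extension fills the hole at the end of
segment 1, and every later one lands at the true EOF. The function and
variable names are made up for illustration and are not PostgreSQL code.

```python
BLCKSZ = 8192          # assumed default block size, in bytes
RELSEG_SIZE = 131072   # assumed blocks per 1 GB segment


def implied_true_eof(target_blkno, landed_blkno, apparent_nblocks):
    """Return the true EOF (in blocks) implied by where the block landed.

    Simplified model (an assumption, not the actual server code):
    replay extends the relation once per block from apparent_nblocks
    through target_blkno. The first extension fills the missing page,
    after which the real EOF becomes visible and all remaining
    extensions are appended there.
    """
    extensions = target_blkno - apparent_nblocks + 1
    appended_at_true_eof = extensions - 1
    # The last appended page sits at true_eof + appended_at_true_eof - 1.
    return landed_blkno - (appended_at_true_eof - 1)


# Segments 0 and 1 present, with one page missing at the end of segment 1.
apparent = 2 * RELSEG_SIZE - 1

eof_blocks = implied_true_eof(3634978, 7141472, apparent)
eof_gb = eof_blocks * BLCKSZ / 2**30
growth = eof_blocks / apparent

print(eof_blocks, round(eof_gb, 2), round(growth, 1))
```

Under these assumptions the implied EOF comes out at 3768638 blocks,
about 28.75 GB, and the ratio of that to the just-under-2 GB apparent
size is a bit over 14, which is where the growth factor comes from.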

The other examples seem to fit the theory a bit better, but this
one is hard to explain this way.

The other big problem for this theory is that you said in
http://www.postgresql.org/message-id/CAM-w4HPvJCBRVV3dXg8aj0WzkU08dHuX-XYbfDYQhNrn5bnTQg@mail.gmail.com

> What's worse is we created a new standby from the same base backup and
> replayed the same records and it didn't reproduce the problem.

If this were the explanation, it oughta be reproducible that way.

I still agree that XLogReadBufferExtended shouldn't be assuming that P_NEW
will not skip pages. But I think we have another bug in here somewhere.

regards, tom lane
