Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1

From: Andres Freund <andres(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1
Date: 2013-11-20 12:00:35
Message-ID: 20131120120035.GA25406@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2013-11-20 12:48:50 +0200, Heikki Linnakangas wrote:
> On 19.11.2013 16:22, Andres Freund wrote:
> >On 2013-11-19 15:20:01 +0100, Andres Freund wrote:
> >>Imo something the attached patch should be done. The description I came
> >>up with is:
> >>
> >> Fix Hot-Standby initialization of clog and subtrans.
>
> Looks ok for a back-patchable fix.
>
> It's a bit bizarre that the ExtendSUBTRANS loop in
> ProcArrayApplyRecoveryInfo looks different from the one in
> RecordKnownAssignedTransactionIds, but both look correct to me.

That's because of the different upper bounds (nextxid) vs xid]), but
yea, I wondered about that as well.

> In master, it'd be nice to do some further cleanup. Some gripes:
>
> In ProcArrayApplyXidAssignment, I wonder if it would be best to just remove
> the (standbyState == STANDBY_INITIALIZED) check altogether. The
> KnownAssignedXidsRemoveTree() that follows is harmless if there is nothing
> in the known-assigned-xids array (xact_redo_commit does it in
> STANDBY_INITIALIZED state too). The other thing that's done after that check
> is updating lastOverflowedXid, and AFAICS it would be harmless to update
> that, too. It will be overwritten by the ProcArrayApplyRecoveryInfo() call
> that comes later.

I was thinking about removing it entirely in the patch, but chose not to
do so. I don't really care which way we go.

> Clog, subtrans and multixact are all handled differently. Extensions of clog
> and multixact are WAL-logged, but extensions of subtrans are not. They all
> have a Startup function, but it has a slightly different purpose.
> StartupCLOG only sets latest_page_number, but StartupSUBTRANS and
> StartupMultiXact zero out the current page. For CLOG, the TrimCLOG()
> function does that. Why is clog handled differently from multixact?

I'd guess clog and multixact are handled differently because multixact
supposedly is never queried during recovery. But I don't that's actually
still true, thinking of 9.3's changes around fkey locks and
HeapTupleGetUpdateXid().
So it's probably time to split StartupMultiXact similar to clog's routines.

> StartupCLOG() and StartupMultiXact set latest_page_number, but
> StartupSUBTRANS does not. Is that a problem for subtrans?

I don't think it is, the difference is that StartupSUBTRANS() zeroes out
the current subtrans page which will set latest_page_number, the other's
access the pages normally, which doesn't set it.

> StartupCLOG() and
> StartupMultiXact() are called at different stages in hot standby -
> StartupCLOG() is called at the beginning of recovery, but StartupMultiXact()
> is only called at end of recovery. Why the discrepancy, does
> latest_page_number need to be set during hot standby or not?

latest_page_number primarily is an optimization, isn't it? Except for a
safeguard check in SimpleLruTruncate() it should only influence victim
buffer initialization. But: slru.c explicitly doesn't initialize
->latest_page_number, which means it's zeroed from a memset slightly
above. Which seems we might cause problems when performing truncations
during recovery, before the first page is zeroed (which'd set
latest_page_number again).
...
Hm. Do we actually *ever* truncate the multixact slru during recovery?
clog.c's truncations are WAL logged, TruncateSUBTRANS() is performed by
restartpoints, but there's no callers to TruncateMultiXact but
vac_truncate_clog and it's not logged? That doesn't seem right.

> I think we should bite the bullet, and WAL-log the extension of subtrans,
> too. Then make the startup and extension procedure for all the SLRUs the
> same.

Hm. I don't really see a big advantage in that? I am all for trying to
bring more symetry to the startup routines, but I don't think forcing
WAL logging for something scrapped every restart is necessary for that.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Rajeev rastogi 2013-11-20 12:10:38 Re: COPY table FROM STDIN doesn't show count tag
Previous Message Rodolfo Campero 2013-11-20 11:53:38 Re: information schema parameter_default implementation