Re: pg_upgrade broken by xlog numbering

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Heikki Linnakangas <heikki(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_upgrade broken by xlog numbering
Date: 2012-06-25 16:06:25
Message-ID: CA+TgmoYpLwLARH5yzOLir9uzpgGAsKN-tOo-ZpUF9KAqNP13Ug@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 25, 2012 at 11:50 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On MacOS X, on latest sources, initdb fails:
>
>> creating directory /Users/rhaas/pgsql/src/test/regress/./tmp_check/data ... ok
>> creating subdirectories ... ok
>> selecting default max_connections ... 100
>> selecting default shared_buffers ... 32MB
>> creating configuration files ... ok
>> creating template1 database in
>> /Users/rhaas/pgsql/src/test/regress/./tmp_check/data/base/1 ... ok
>> initializing pg_authid ... ok
>> initializing dependencies ... ok
>> creating system views ... ok
>> loading system objects' descriptions ... ok
>> creating collations ... ok
>> creating conversions ... ok
>> creating dictionaries ... FATAL:  control file contains invalid data
>> child process exited with exit code 1
>
> Same for me.  It's crashing here:
>
>    if (ControlFile->state < DB_SHUTDOWNED ||
>        ControlFile->state > DB_IN_PRODUCTION ||
>        !XRecOffIsValid(ControlFile->checkPoint))
>        ereport(FATAL,
>                (errmsg("control file contains invalid data")));
>
> state == DB_SHUTDOWNED, so the problem is with the XRecOffIsValid test.
> ControlFile->checkPoint == 19972072 (0x130BFE8), what's wrong with that?
>
> (I suppose the reason this is only failing on some machines is
> platform-specific variations in xlog entry size, but it's still a bit
> distressing that this got committed in such a broken state.)

I'm guessing that the problem is as follows: in the old code, the
XLogRecord header could not be split, so any offset that was closer to
the end of the page than SizeOfXLogRecord was a sure sign of trouble.
But commit 061e7efb1b4c5b8a5d02122b7780531b8d5bf23d relaxed that
restriction, so now it IS legal for the checkpoint record to be where
it is. But it seems that XRecOffIsValid() didn't get the memo.
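
To make the arithmetic concrete, here is a rough sketch of the two rules as
I understand them. This is not the actual macro from the tree: the function
names are made up for illustration, and the three sizes are assumed values
(8192-byte WAL pages, a 24-byte short page header, a 32-byte record header),
so check them against the real headers before relying on any of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed, illustrative sizes; not copied from the PostgreSQL headers. */
#define WAL_PAGE_SIZE      8192u  /* assumed WAL page size */
#define SHORT_PAGE_HEADER  24u    /* assumed size of the per-page header */
#define RECORD_HEADER      32u    /* assumed size of a record header */

/* Byte position of a WAL offset within its page. */
uint32_t offset_in_page(uint32_t xrecoff)
{
    return xrecoff % WAL_PAGE_SIZE;
}

/*
 * Old-style rule (hypothetical name): the record header may not be split,
 * so a whole header must fit between the offset and the end of the page.
 */
bool offset_valid_strict(uint32_t xrecoff)
{
    uint32_t off = offset_in_page(xrecoff);

    return off >= SHORT_PAGE_HEADER &&
           (WAL_PAGE_SIZE - off) >= RECORD_HEADER;
}

/*
 * Relaxed rule (hypothetical name): the header may continue on the next
 * page, so the offset only has to fall after the page header.
 */
bool offset_valid_relaxed(uint32_t xrecoff)
{
    return offset_in_page(xrecoff) >= SHORT_PAGE_HEADER;
}

/* Walk through the checkpoint location reported upthread. */
void demo_reported_checkpoint(void)
{
    uint32_t checkpoint = 19972072u;    /* 0x130BFE8 */

    assert(offset_in_page(checkpoint) == 8168u);
    assert(WAL_PAGE_SIZE - offset_in_page(checkpoint) == 24u);
    assert(!offset_valid_strict(checkpoint));
    assert(offset_valid_relaxed(checkpoint));
}
```

With those assumed sizes, 0x130BFE8 sits 8168 bytes into its page, leaving
only 24 bytes before the page boundary. That is too little for an unsplit
header, so the strict rule rejects it, while the relaxed rule accepts it.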

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
