Re: pg_upgrade bug found!

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, RhodiumToad on IRC <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_upgrade bug found!
Date: 2011-04-07 20:18:24
Message-ID: BANLkTi=eeMJEFzT0VDXkPG_vNWCtU6Okfg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Apr 7, 2011 at 3:46 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> Jeff Davis wrote:
>> > I have added a personal regression test to show which
>> > pg_class.relfrozenxid values are not preserved, and with this patch the
>> > only ones not preserved are toast tables used by system tables, which
>> > are not copied from the old cluster (FirstNormalObjectId = 16384).  I am
>> > attaching that old/new pg_class.relfrozenxid diff as well.
>> >
>> > Any idea how to correct existing systems?  Would VACUUM FREEZE of just
>> > the toast tables work?
>>
>> VACUUM FREEZE will never set the relfrozenxid backward. If it was never
>> preserved to begin with, I assume that the existing value could be
>> arbitrarily before or after, so it might not be updated.
>>
>> I think that after you VACUUM FREEZE the toast table, then the real
>> oldest frozen xid (as opposed to the bad value in relfrozenxid for the
>> toast table) would have to be the same or newer than that of the heap.
>> Right? That means you could safely copy the heap's relfrozenxid to the
>> relfrozenxid of its toast table.
>
> OK, so the only other idea I have is to write some pretty complicated
> query function that does a sequential scan of each toast table and pulls
> the earliest xmin/xmax from the tables and use that to set the
> relfrozenxid (pretty complicated because it has to deal with the freeze
> horizon and wraparound).
>
>> > I perhaps could create a short DO block that
>> > would vacuum freeze just toast tables;  it would have to be run in every
>> > database.
>>
>> Well, that won't work, because VACUUM can't be executed in a transaction
>> block or function.
>
> Good point.
>
> The only bright part of this is that missing clog will throw an error so
> we are not returning incorrect data, and hopefully people will report
> problems to us when it happens.
>
> Ideally I would like to get this patch and correction code out into the
> field in case more people run into this problem.  I know some will, I
> just don't know how many.
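
As a rough illustration of the workaround discussed in the quoted text: since VACUUM cannot run inside a DO block or a transaction block, one option is to generate a plain script of standalone statements and feed it to psql once per database. The sketch below is only that, a sketch. The helper names (quote_ident, vacuum_freeze_script) are hypothetical, and the (schema, table) pairs are assumed to come from a catalog query on pg_class with relkind = 't'.

```python
def quote_ident(name: str) -> str:
    """Double-quote an SQL identifier, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def vacuum_freeze_script(toast_tables):
    """Return one standalone VACUUM FREEZE statement per toast table.

    toast_tables: iterable of (schema, table) pairs. They are assumed to
    come from a catalog query run in the target database, for example:
        SELECT n.nspname, c.relname
        FROM pg_class c JOIN pg_namespace n ON n.oid = c.relnamespace
        WHERE c.relkind = 't';
    """
    # Each statement stands alone so it is never wrapped in a transaction.
    return "".join(
        f"VACUUM FREEZE {quote_ident(schema)}.{quote_ident(table)};\n"
        for schema, table in toast_tables
    )


if __name__ == "__main__":
    # Hypothetical toast table name, for illustration only.
    print(vacuum_freeze_script([("pg_toast", "pg_toast_16384")]), end="")
```

The generated script would have to be run as a superuser in every database, and without psql's --single-transaction option, so that each VACUUM executes outside a transaction block. As noted in the quoted text, this alone does not repair a relfrozenxid that is already wrong.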

ISTM we need to force a minor release once we are sure this has been
corrected. We should also probably put out an announcement warning
people who have already used pg_upgrade of possible data corruption.
I'm not sure exactly what the language around that should be, but this
does seem pretty bad.
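
For readers wondering why the scan-based fix in the quoted text "has to deal with ... wraparound": transaction IDs are 32-bit counters that wrap, so the "earliest" xmin/xmax cannot be found with a plain numeric minimum. The sketch below shows a signed-difference comparison of the kind PostgreSQL uses internally. The function names are hypothetical, and this is not the correction code discussed in the thread.

```python
FIRST_NORMAL_XID = 3  # XIDs 0-2 are reserved special values in PostgreSQL


def xid_precedes(a: int, b: int) -> bool:
    """True if XID a is logically older than XID b, modulo 2**32."""
    if a < FIRST_NORMAL_XID or b < FIRST_NORMAL_XID:
        # Special XIDs are compared as plain unsigned numbers.
        return a < b
    diff = (a - b) & 0xFFFFFFFF
    # A "negative" signed 32-bit difference means a is older than b.
    return diff >= 0x80000000


def oldest_xid(xids):
    """Return the logically oldest normal XID, or None if there is none.

    Only meaningful when all normal XIDs lie within 2**31 of each other,
    which is the invariant that freezing is meant to maintain.
    """
    oldest = None
    for x in xids:
        if x < FIRST_NORMAL_XID:
            continue  # skip invalid/bootstrap/frozen markers
        if oldest is None or xid_precedes(x, oldest):
            oldest = x
    return oldest
```

For example, oldest_xid([100, 4294967000, 2, 50]) returns 4294967000 rather than 50, because 4294967000 was assigned shortly before the counter wrapped around.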

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company