Re: [GENERAL] pg_upgrade problem

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: daveg <daveg(at)sonic(dot)net>
Cc: hubert depesz lubaczewski <depesz(at)depesz(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [GENERAL] pg_upgrade problem
Date: 2011-09-01 01:56:59
Message-ID: 201109010156.p811uxG19239@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

daveg wrote:
> On Mon, Aug 29, 2011 at 07:49:24PM +0200, hubert depesz lubaczewski wrote:
> > On Mon, Aug 29, 2011 at 06:54:41PM +0200, hubert depesz lubaczewski wrote:
> > vacuumdb: vacuuming of database "etsy_v2" failed: ERROR: could not access status of transaction 3429738606
> > DETAIL: Could not open file "pg_clog/0CC6": No such file or directory.
> >
> > Interestingly.
> >
> > In old dir there is pg_clog directory with files:
> > 0AC0 .. 0DAF (including 0CC6, size 262144)
> > but new pg_clog has only:
> > 0D2F .. 0DB0
> >
> > File content - nearly all files that exist in both places are the same, with exception of 2 newest ones in new datadir:
> > 3c5122f3e80851735c19522065a2d12a 0DAF
> > 8651fc2b9fa3d27cfb5b496165cead68 0DB0
> >
> > 0DB0 doesn't exist in old, and 0DAF has different md5sum: 7d48996c762d6a10f8eda88ae766c5dd
> >
> > one more thing. I did select count(*) from transactions and it worked.
> >
> > that's about it. I can probably copy over files from old datadir to new (in
> > pg_clog/), and will be happy to do it, but I'll wait for your call - retry with
> > copies files might destroy some evidence.
>
> I had this same thing happen this Saturday just past and my client had to
> restore the whole 2+ TB instance from the previous days pg_dumps.
> I had been thinking that perhaps I did something wrong in setting up or
> running the upgrade, but had not found it yet. Now that I see Hubert has
> the same problem it is starting to look like pg_upgrade can eat all your
> data.
>
> After running pg_upgrade apparently successfully and analyzeing all the
> tables we restarted the production workload and started getting errors:
>
> 2011-08-27 04:18:34.015 12337 c06 postgres ERROR: could not access status of transaction 2923961093
> 2011-08-27 04:18:34.015 12337 c06 postgres DETAIL: Could not open file "pg_clog/0AE4": No such file or directory.
> 2011-08-27 04:18:34.015 12337 c06 postgres STATEMENT: analyze public.b_pxx;
>
> On examination the pg_clog directory contained on two files timestamped
> after the startup of the new cluster with 9.0.4. Other hosts that upgraded
> successfully had numerous files in pg_clog dating back a few days. So it
> appears that all the clog files went missing during the upgrade somehow.
> a
> This happened upgrading from 8.4.7 to 9.0.4, with a brief session in between
> at 8.4.8. We have upgraded several hosts to 9.0.4 successfully previously.

I have posted this fix to the hackers email list, but I found it only
affected old 8.3 servers, not old 8.4.X, so I am confused by your bug
report.

I have tested 8.4.X to 9.0.4 and found pg_upgrade preserves toast
relfrozenxids properly in that case.

Can you tell me what table is showing this error? Does it happen during
vacuum? Can you run a vacuum verbose to see what it is throwing the
error on? Thanks.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Bruce Momjian 2011-09-01 02:38:01 Re: toast error after pg_upgrade 9.0.4 -> 9.1rc1
Previous Message Bruce Momjian 2011-09-01 01:54:20 Re: [GENERAL] pg_upgrade problem

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2011-09-01 02:38:01 Re: toast error after pg_upgrade 9.0.4 -> 9.1rc1
Previous Message Bruce Momjian 2011-09-01 01:54:20 Re: [GENERAL] pg_upgrade problem