| From: | daveg <daveg(at)sonic(dot)net> | 
|---|---|
| To: | hubert depesz lubaczewski <depesz(at)depesz(dot)com> | 
| Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: [GENERAL] pg_upgrade problem | 
| Date: | 2011-08-29 22:23:31 | 
| Message-ID: | 20110829222331.GB10597@sonic.net | 
| Lists: | pgsql-general pgsql-hackers | 
On Mon, Aug 29, 2011 at 07:49:24PM +0200, hubert depesz lubaczewski wrote:
> On Mon, Aug 29, 2011 at 06:54:41PM +0200, hubert depesz lubaczewski wrote:
> vacuumdb: vacuuming of database "etsy_v2" failed: ERROR:  could not access status of transaction 3429738606
> DETAIL:  Could not open file "pg_clog/0CC6": No such file or directory.
> 
> Interestingly.
> 
> In old dir there is pg_clog directory with files:
> 0AC0 .. 0DAF (including 0CC6, size 262144)
> but new pg_clog has only:
> 0D2F .. 0DB0
> 
> File content - nearly all files that exist in both places are the same, with exception of 2 newest ones in new datadir:
> 3c5122f3e80851735c19522065a2d12a  0DAF
> 8651fc2b9fa3d27cfb5b496165cead68  0DB0
> 
> 0DB0 doesn't exist in old, and 0DAF has different md5sum: 7d48996c762d6a10f8eda88ae766c5dd
> 
> one more thing. I did select count(*) from transactions and it worked.
> 
> that's about it. I can probably copy over files from old datadir to new (in
> pg_clog/), and will be happy to do it, but I'll wait for your call - retry with
> copies files might destroy some evidence.
I had this same thing happen this Saturday just past and my client had to
restore the whole 2+ TB instance from the previous day's pg_dumps.
I had been thinking that perhaps I did something wrong in setting up or
running the upgrade, but had not found it yet. Now that I see Hubert has
the same problem it is starting to look like pg_upgrade can eat all your
data.
After running pg_upgrade apparently successfully and analyzing all the
tables we restarted the production workload and started getting errors:
2011-08-27 04:18:34.015  12337  c06  postgres  ERROR:  could not access status of transaction 2923961093
2011-08-27 04:18:34.015  12337  c06  postgres  DETAIL:  Could not open file "pg_clog/0AE4": No such file or directory.
2011-08-27 04:18:34.015  12337  c06  postgres  STATEMENT:  analyze public.b_pxx;
On examination the pg_clog directory contained only two files timestamped
after the startup of the new cluster with 9.0.4. Other hosts that upgraded
successfully had numerous files in pg_clog dating back a few days. So it
appears that all the clog files went missing during the upgrade somehow.
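
As a sanity check on the file names in these errors, here is a minimal sketch of how a transaction ID maps to a pg_clog segment file. It assumes stock build settings (8 kB blocks, 2 status bits per transaction, 32 pages per segment), which is consistent with the 262144-byte file size Hubert reported. The function name clog_segment is made up for illustration and is not part of PostgreSQL.

```python
# Assumed defaults: BLCKSZ = 8192, 4 transactions per byte
# (2 status bits each), 32 pages per segment file.
BLOCK_SIZE = 8192
XACTS_PER_BYTE = 4
PAGES_PER_SEGMENT = 32
XACTS_PER_SEGMENT = BLOCK_SIZE * XACTS_PER_BYTE * PAGES_PER_SEGMENT  # 1048576


def clog_segment(xid: int) -> str:
    """Return the pg_clog file name (4 hex digits) that holds the status of xid."""
    return format(xid // XACTS_PER_SEGMENT, "04X")


print(clog_segment(3429738606))  # 0CC6, the file from Hubert's vacuumdb error
print(clog_segment(2923961093))  # 0AE4, the file from the analyze error above
```

Both results match the files named in the error messages, so the failing lookups point at old segments that the new cluster no longer has, not at garbage transaction IDs.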
This happened upgrading from 8.4.7 to 9.0.4, with a brief session in between
at 8.4.8. We have upgraded several hosts to 9.0.4 successfully previously.
-dg
-- 
David Gould       daveg(at)sonic(dot)net      510 536 1443    510 282 0869
If simplicity worked, the world would be overrun with insects.