Re: REINDEX takes half a day (and still not complete!)

From: tv(at)fuzzy(dot)cz
To: "Phoenix Kiula" <phoenix(dot)kiula(at)gmail(dot)com>
Cc: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>, "Merlin Moncure" <mmoncure(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: REINDEX takes half a day (and still not complete!)
Date: 2011-04-18 15:15:10
Message-ID: befaf241e6259678f5961b6f2bdffa48.squirrel@sq.gransy.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

> Thanks. But let me do the "top" stuff later. I think I have a bigger
> problem now.
>
> While doing a PG dump, I seem to get this error:
>
> ERROR: invalid memory alloc request size 4294967293
>
> Upon googling, this seems to be a data corruption issue!
>
> One of the older messages suggests that I do "file level backup and
> restore the data" -
> http://archives.postgresql.org/pgsql-admin/2008-05/msg00191.php
>
> How does one do this -- should I copy the data folder? What are the
> specific steps to restore from here, would I simply copy the files
> from the data folder back to the new install or something? Cant find
> these steps in the PG documentation.

Just stop the database, and copy the 'data' directory somewhere else (to a
different machine prefferably). You can then start the database from this
directory copy (not sure how that works in CentOS, but you can always run
"postmaster -D directory").

>
> I'm on PG 8.2.9, CentOS 5, with 8GB of RAM.
>

This is a massive thread (and part of the important info is in another
thread other mailing lists), so maybe I've missed something important, but
it seems like:

1) You're I/O bound (according to the 100% utilization reported by iostat).

2) Well, you're running RAID1 setup, which basically means it's 1 drive
(and you're doing reindex, which means a lot of read/writes).

3) The raid controller should handle this, unless it's broken, the battery
is empty (and thus the writes are not cached) or something like that. I'm
not that familiar with 3ware - is there any diagnostic tool that you use
to check the health of the controller / drives?

4) I know you've mentioned there is no bloat (according to off-the-list
discussion with Merlin) - is this true for the table only? Because if the
index is not bloated, then there's no point in running reindex ...

BTW what is the size of the database and that big table? I know it's 125
million rows, but how much is that? 1GB, 1TB, ... how much? What does
this return

SELECT reltuples FROM pg_class WHERE relname = 'links';

Do you have any pg_dump backups? What size are they, compared to the live
database? Havou you tried to rebuild the database from these backups? That
would give you a fresh indexes, so you could see how a 'perfectly clean'
database looks (whether the indexes bloated, what speed is expected etc.).

regards
Tomas

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Samuel Gendler 2011-04-18 16:01:44 Re: Is there a way to selective dump of records in Postgres 9.0.3?
Previous Message Nikolas Everett 2011-04-18 15:11:26 Re: Is there a way to selective dump of records in Postgres 9.0.3?