Re: vacuumdb question/problem

From: David Ondrejik <David(dot)Ondrejik(at)noaa(dot)gov>
To: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: vacuumdb question/problem
Date: 2011-07-21 14:44:37
Message-ID: 4E283B55.5080206@noaa.gov
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Thanks to everyone for their response and help. I still have some
more questions that hopefully someone can help me with as I have not yet
been able to solve my vacuumdb problem.

The posting of data to the table in question is extremely
slow...yesterday I saw that it took over 6 min to post just 124 rows of
data. That is just not acceptable. Additionally, we have about 9,000 to
11,000 products that come in daily (some contain one row of
data...others may be several hundred lines) and are posted to the
database. The system used to run with maybe 10-100 products in the queue
(that is before I posted over 200 million rows). Yesterday, there were
over 25,000 products in the queue waiting to be processed - which is bad
for our operational use.

I think the table got even more bloated when I tried to do the vacuum
full last week and had to kill the process - it never completed.

From the info I received from previous posts, I am going to change my
game plan. If anyone has thoughts as to different process or can confirm
that I am on the right track, I would appreciate your input.

1. I am going to run a CLUSTER on the table instead of a VACUUM FULL.
But I have a few questions that are not clear from the info I have found
on the internet regarding this process. The table name is 'pecrsep',
and the database is hd_ob92rha. It has multiple columns in the table,
but I am choosing the 'obsdate' as my index (I believe 'index' refers to
the name of a column in the table?). Some of the columns are listed:

lid | pe1 | pe2 | dur | idur | t | s | e | p | obstime | z0000 | z0015 |
z0030 | z0045 | z0100 | z0115 | z0130 ... etc

Anyway, if I interpret the info correctly the first time a CLUSTER is
run, I have to run something like:

CLUSTER obsdate ON pecrsep (CLUSTER /indexname/ ON /tablename/)

Is it true that the first time I run a CLUSTER, I must provide an
'indexname' or can I just run: CLUSTER pecrsep (CLUSTER tablename)? Or
is it better to provide an indexname?

2. I know there are a lot of variables such as table size, etc. Are we
looking at something that will take couple hours...or 24+ hours?

Is there a way to monitor the CLUSTER process to ensure its working?

Is there any way to get output to give an estimate as to how much of the
CLUSTER has completed...and how much is left to run?

3. With the info from previous posts, I am certainly rethinking the use
of VACUUM FULL.

After I run the CLUSTER, should I REINDEX the table, or is that
redundant to the CLUSTER?

If I run the REINDEX, I would do it just for the single table: REINDEX
TABLE pecrsep;

Any idea as to how long this may take (i.e 1-2 hours or 24+ hours)?

4. If either/both the CLUSTER or REINDEX are successful, that would
indicate that I don't need to run a VACUUM FULL...correct?

5. Then I would run a VACUUM ANALYZE as suggested.

Does this sound like an appropriate plan?

After I do all this on the single table, should I repeat something
similar for the whole database? Or should I just attack the largest tables?

One last question, can anyone recommend a good postgres book for me to
purchase? I guess I should repeat that we are still running version
8.2.6 of postgres on Linux machines. Is there an appropriate book for
this version available?

Thanks again for your help...I hope I didn't ask too many questions, but
the database is in poor shape and I need to get it working more
efficiently quickly.

Best Regards,
Dave

Attachment Content-Type Size
david_ondrejik.vcf text/x-vcard 309 bytes

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Kevin Grittner 2011-07-21 15:06:45 Re: how to make PostgreSQL using "all" memory and chaching the DB completely there
Previous Message Tom Lane 2011-07-21 14:36:56 Re: Out of memory