Re: 121+ million record table perf problems

From: Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: 121+ million record table perf problems
Date: 2007-05-18 18:30:18
Message-ID: 20070518183018.GN10921@phlogiston.dyndns.org
Lists: pgsql-performance

On Fri, May 18, 2007 at 12:43:40PM -0500, cyber-postgres(at)midnightfantasy(dot)com wrote:
> I've got a table with ~121 million records in it. Select count on it
> currently takes ~45 minutes, and an update to the table to set a value on
> one of the columns I finally killed after it ran 17 hours and had still
> not completed. Queries into the table are butt slow, and

I don't think you've told us anything like enough to get started on
solving your problem. But to start with, you know that in Postgres,
an unrestricted count() on a table always results in reading the
entire table, right?
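
If an estimate is good enough, the planner's statistics are far
cheaper to consult than a full scan. A minimal sketch (it assumes
the table is named res and that its stats are reasonably current,
i.e. it has been analysed recently):

    -- rough row count from the catalog; no table scan
    SELECT reltuples::bigint AS approx_rows
      FROM pg_class
     WHERE relname = 'res';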

Standard question: have you run VACUUM or ANALYZE on this table
recently?
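
For a table this size, the plain forms are what I'd reach for first:

    VACUUM ANALYZE res;

That reclaims dead space and refreshes the statistics in one pass.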

Your update statement is also a case where you have to touch every
row. Note that, since you seem to be setting the state field to
the same value for everything, an index on that column will do you
not one jot of good until there's greater selectivity.
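
(Once the values do diverge, a partial index over the uncommon
states can pay off. Hypothetical sketch, assuming most rows end up
in state 5001:

    CREATE INDEX res_state_pending ON res (state)
     WHERE state <> 5001;

Until then, don't bother.)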

How fast is the disk? Is it fast enough to read and touch every one
of those rows in the table inside of 17 hours?

Note also that your approach of updating all 121 million records in
one statement is approximately the worst way to do this in Postgres,
because it creates 121 million dead tuples on your table. (You've
created some number of those by killing the query as well.)
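
If you must rewrite every row, doing it in slices with a vacuum
between them keeps the dead-tuple pileup bounded. A sketch only,
assuming resid is an indexed integer key:

    UPDATE res SET state = 5001
     WHERE resid >= 0 AND resid < 10000000;
    VACUUM res;

    UPDATE res SET state = 5001
     WHERE resid >= 10000000 AND resid < 20000000;
    VACUUM res;
    -- ...and so on through the key range

The vacuum lets each slice reuse the space the previous one left
behind, instead of bloating the table.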

All of that said, 17 hours seems kinda long.

> As a test I am trying to do an update on state using the following queries:
> update res set state=5001;
> select count(resid) from res;

What is this testing?

> The update query that started this all I had to kill after 17 hours.

Does that suggest that the update you're trying to make work well is
_not_ update res set state = 5001?

> each) and is running on a single disk (guess I will likely have to at the
> minimum go to a RAID1). Workload will primarily be comprised of queries

I bet that single disk is your problem. Iostat is your friend, I'd
say.
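
If you're on Linux, something like

    iostat -x 5

while one of those queries runs will tell you quickly: %util pegged
near 100 and climbing await times mean the disk is the bottleneck.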

A

--
Andrew Sullivan | ajs(at)crankycanuck(dot)ca
Everything that happens in the world happens at some place.
--Jane Jacobs
