Re: Thousands of tables versus on table?

From: Thomas Andrews <tandrews(at)soliantconsulting(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Mark Lewis <mark(dot)lewis(at)mir3(dot)com>, <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Thousands of tables versus on table?
Date: 2007-06-04 19:57:41
Message-ID: C289E6F5.A74%tandrews@soliantconsulting.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 6/4/07 3:43 PM, "Gregory Stark" <stark(at)enterprisedb(dot)com> wrote:

>
> "Thomas Andrews" <tandrews(at)soliantconsulting(dot)com> writes:
>
>> I guess my real question is, does it ever make sense to create thousands of
>> tables like this?
>
> Sometimes. But usually it's not a good idea.
>
> What you're proposing is basically partitioning, though you may not actually
> need to put all the partitions together for your purposes. Partitioning's main
> benefit is in the management of the data. You can drop and load partitions in
> chunks rather than have to perform large operations on millions of records.
>
> Postgres doesn't really get any faster by breaking the tables up like that. In
> fact it probably gets slower as it has to look up which of the thousands of
> tables you want to work with.
>
> How often do you update or delete records and how many do you update or
> delete? Once per day is a very low frequency for vacuuming a busy table, you
> may be suffering from table bloat. But if you never delete or update records
> then that's irrelevant.

It looks like the most inserts that have occurred in a day is about 2000.
The responders table has 1.3 million records, the responses table has 50
million records. Most of the inserts are in the responses table.

>
> Does reindexing or clustering the table make a marked difference?
>

Clustering sounds like it might be a really good solution. How long does a
cluster command usually take on a table with 50,000,000 records? Is it
something that can be run daily/weekly?

I'd rather not post the schema because it's not mine - I'm a consultant. I
can tell you our vacuum every night is taking 2 hours and that disk IO is
the real killer - the CPU rarely gets higher than 20% or so.

=thomas

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Douglas J Hunley 2007-06-04 20:01:08 Re: upgraded to pgsql 8.2.4, getting worse performance then 7.4.x
Previous Message Gregory Stark 2007-06-04 19:43:38 Re: Thousands of tables versus on table?