Re: select count() out of memory

From: Erik Jones <erik(at)myemma(dot)com>
To: tfinneid(at)student(dot)matnat(dot)uio(dot)no
Cc: "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, "Gregory Stark" <stark(at)enterprisedb(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: select count() out of memory
Date: 2007-10-25 15:43:44
Message-ID: 2452A2A9-4E9D-44C2-9E30-DF369B1CABB1@myemma.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On Oct 25, 2007, at 10:36 AM, tfinneid(at)student(dot)matnat(dot)uio(dot)no wrote:

>>> The db worked fine until it reached perhaps 30-40 thousand
>>> partitions.
>>
>> It depends on how you have the partitions set up and how you're
>> accessing them. Are all of these partitions under the same parent
>> table? If so, then trying run a SELECT COUNT(*) against the parent
>> table is simply insane. Think about it, you're asking one query to
>> scan 55000 tables. What you need to do is partition based on your
>> access patterns, not what you *think* will help with performance down
>> the road. Look into constraint exclusion, whether or not you can
>> just access child tables directly, and whether you really need all of
>> these under one logical table. Also, no matter how you do the
>> partitioning, once you get up to that many and more relations in your
>> system, dumps and restores take a lot longer.
>
> The design is based on access patterns, i.e. one partition
> represents a
> group of data along a discrete axis, so the partitions are the
> perfect for
> modeling that. Only the last partition will be used on normal
> cases. The
> previous partitions only need to exists until the operator deletes
> them,
> which will be sometime between 1-6 weeks.
>
> Regarding dumps and restore; the system will always be offline during
> those operations and it will be so for several days, because a new
> project
> might start at another location in the world, so the travelling there
> takes time. In the mean time, all admin tasks can be performed without
> problems, even backup operations that take 3 days.

Excellent, it sounds like you should be fine then. One thing to
note: if you want to get an "idea" of how many rows you have in your
partitions, you can run a SUM aggregate on reltuples in pg_class for
all of your partitions. The more recent the last ANALYZE for each
table, the more accurate those values will be.

Erik Jones

Software Developer | Emma®
erik(at)myemma(dot)com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message tfinneid 2007-10-25 15:45:38 Re: select count() out of memory
Previous Message tfinneid 2007-10-25 15:36:09 Re: select count() out of memory