Quick Links

Re: performance of partitioning?

From:	Brent Wood <b(dot)wood(at)niwa(dot)co(dot)nz>
To:	George Nychis <gnychis(at)cmu(dot)edu>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: performance of partitioning?
Date:	2007-02-27 20:12:24
Message-ID:	45E490A8.50204@niwa.co.nz
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

George Nychis wrote:
> Hey all,
>
> So I have a master table called "flows" and 400 partitions in the
> format "flow_*" where * is equal to some epoch.
>
> Each partition contains ~700,000 rows and has a check such that 1
> field is equal to a value:
> "flows_1107246900_interval_check" CHECK ("interval" = '2005-02-01
> 03:35:00'::timestamp without time zone)
>
> Each partition has a different and unique non-overlapping check.
>
> This query takes about 5 seconds to execute:
> dp=> select count(*) from flows_1107246900;
> count
> --------
> 696836
> (1 row)
>
> This query has been running for 10 minutes now and hasn't stopped:
> dp=> select count(*) from flows where interval='2005-02-01 03:35:00';
>
> Isn't partitioning supposed to make the second query almost as fast?
> My WHERE is exactly the partitioning constraint, therefore it only
> needs to go to 1 partition and execute the query.
>
> Why would it take magnitudes longer to run? Am i misunderstanding
> something?

We have a db with only 200,000,000 records, partitioned by year with
about 15 partitions. There is a clustered index on the timestamp field
and queries like a 25 wide self join for 3 months data are around 20
seconds. On a desktop box with a single SATA drive.

If you are querying by timestamp, I suggest a clustered index will help.

Brent Wood

In response to

performance of partitioning? at 2007-02-27 14:00:09 from George Nychis

Browse pgsql-general by date

	From	Date	Subject
Next Message	Jimmy Zhang	2007-02-27 20:13:21	[ANN]VTD-XML 2.0
Previous Message	Lenorovitz, Joel	2007-02-27 19:57:51	How to use OIDs on tables......OR....a better solution?