Re: Partition table in 9.0.x?

From: AJ Weber <aweber(at)comcast(dot)net>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Partition table in 9.0.x?
Date: 2013-01-08 16:45:49
Message-ID: 50EC4D3D.9060209@comcast.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance


>
> It probably does, but from psql command line, you can do \d+ and \di+
\d+ doesn't appear to display any size information.

>
> If you have little control over your storage and are already IO bound,
> and the tables are growing rapidly, you may need to rethink that
> "deletes are rare" bit. So the inserts and updates do target a hot
> part, while the selects are evenly spread?
>
> In that case, it is very important to know if the slow part are the
> selects, or the insert and deletes. If the selects are slow, and the
> hot rows for selects can't be gathered together into a hot partition,
> then after clustering they will still be slow as the disk will still
> have to seek all over the place (massive data-mining type selects
> might be an exception to that, but I wouldn't count on it).
Since order_num is sequential, I could partition on it in broad
(sequential) ranges. That would put all recent/new rows in one
table-partition that would be a fraction of the size of the overall
(unpartitioned) table. I guess that would require manual maintenance
over-time (to switch to another, new partition as each grows).

>
> I think it is a matter of semantics. A small table is poor candidate
> for partitioning even if it has an excellent key to use for
> partitioning. A large table could be a good candidate up until you
> realize it doesn't have a good key to use, at which point it stops
> being a good candidate (in my opinion).
>
My first idea to evenly-partition the table was to use the order_num and
do a "mod" on it with the number of tables I wanted to use. That would
yield a partition-table number of 0-mod, and all rows for the same order
would stay within the same partition-table. However, you're right in
thinking that a search for orders could -- really WOULD -- require
retrieving details from multiple partitions, probably increasing IO. So
maybe the sequential partitioning (if at all) is better, just more
maintenance down-the-road.
>
> Was the order_num (from the parent table) the leading field of the 4
> column PK? If not, you might want to reorder the PK so that it is the
> leading field and cluster again. Or if reordering the PK columns is
> not convenient, make a new index on the order_num and cluster on that
> (perhaps dropping the index after the cluster, if it no longer serves
> a purpose)
>
Yes, the order_num is the first column in the PK, and our main browse
queries use, at a minimum, the first 2-3 columns in that PK in their
where-clause.

Many thanks again for all the input!
-AJ

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jeff Janes 2013-01-08 17:21:42 Re: Partition table in 9.0.x?
Previous Message Patrick Dung 2013-01-08 16:26:50 Re: Sub optimal performance with default setting of Postgresql with FreeBSD 9.1 on ZFS