Questions about Partitioning

From: Phoenix Kiula <phoenix(dot)kiula(at)gmail(dot)com>
To: PG-General Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Questions about Partitioning
Date: 2011-04-19 15:56:31
Message-ID: BANLkTikEB1rVYiBG9jTRLxkesXKq2wBXHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

While I fix some bigger DB woes, I have learned a lesson. Huge indexes
and tables are a pain.

Which makes me doubly keen on looking at partitioning.

Most examples I see online are partitioned by date. As in months, or
quarter, and so on. This doesn't work for me as I don't have too much
logic required based on time.

The biggest, highest volume SELECT in my database happens through an
"alias" column. This is an alphanumeric column. The second-biggest
SELECT happens through the "userid" column -- because many users check
their account every day.

A rough table definition can be considered as follows:

CREATE TABLE maintable
id SERIAL primary key
alias VARCHAR(42)
...
user_id VARCHAR(30)
user_registered BOOLEAN
status VARCHAR(1)

My questions:

1. Which column should I partition by -- the "alias" because it's the
largest contributor of queries? This should be OK, but my concern is
that when user_id queries are happening, then the data for the same
user will come through many subtables that are partitioned by "alias"
-- will this happen automatically (presuming "constraint exclusion" is
on)? How does partitioning by one column affect queries on others.
Will there be subtable-by-subtable indexes on both alias and

2. How does SERIAL type work with partitions? Will INSERT data go into
the respective partitions and yet maintain an overall sequence -- I
mean, the *same* overall sequence for the parent table distributed
automagically across subtables?

3. If I partition using "a%", "b%" etc up to "z%" as the partition
condition, is this an issue -- are about 26 subtables too many
partitions? Mine are static partitions as in they will be the same
forever, unlike data-based partitions. And each partition will
continue to grow. If I include that "alias"es can begin with numbers
and allowed symbols too, then this may be 45 partitions? What's the
limit of partitions -- not only official limit, but practical limit in
terms of performance?

4. Given that it's a wildcard LIKE condition (with a "%") will this
affect the index and subsequent SELECT speed? Are partition conditions
recommended to be "=" or "<" type operators only or is LIKE ok??

5. Does partitioning need to happen only through one column? Can I
have a condition containing two columns instead?

CREATE TABLE subtable_a (
PRIMARY KEY (id)
CHECK ( user_id LIKE 'a%' and user_registered IS TRUE)
) INHERITS (maintable);

CREATE TABLE subtable_b (
PRIMARY KEY (id),
CHECK ( user_id LIKE 'b%' and user_registered IS TRUE)
) INHERITS (maintable);

......etc....

6. Triggers - how do they affect speed? Everything, insert, update,
select will happen through this conditional trigger. I will likely be
writing this in PLSQL, but I read in several websites that C triggers
are much faster than PLSQL triggers. Is this a concern?

7. "Constraint exclusion" - is it recommended to have this in the
pg.conf, or will I need to do this before every SQL? I prefer the
pg.conf way, but want to confirm that there are no downsides for other
regular SQL operations with this setting?

8. How will JOIN work? I have different tables JOINing with the parent
table now. With partitioned subtables, will constraint exclusion
automatically do what's needed and my SQL does not need to change? Or
will there be triggers required for each and every query I currently
have?

Eight questions is enough for my first post in this partitioning thread :)

Thanks much!

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Benjamin Smith 2011-04-19 17:07:52 Re: SSDs with Postgresql?
Previous Message Taras Kopets 2011-04-19 15:26:30 Re: how to force an insert before Raise Exception?