Named vs Unnamed Partitions

From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
Subject: Named vs Unnamed Partitions
Date: 2008-01-08 19:08:50
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


IMO, the lengthy discussion about Segment Exclusion and Segment
Visibility Maps has long turned into a discussion about partitioning in
general. I'm thankful for all the new insights it has brought me and I
want to continue sharing my view on things. What's following is highly
theoretical and has brainstorming characteristics. You've been warned.

There are two very distinct ways to handle partitioning. For now, I'm
calling them named and unnamed partitioning. Let's have a closer look at
both options from a users point of view. I'm using Andrew's pseudo DDL
example from the above mentioned thread:

SET read_only='t'
WHERE created_on < '2007-01-01';

Given all tuples were read-writeable before, that implicitly created two
partitions. Giving them names could look like that:

SPLIT INTO old_foos AND new_foos;
AT created_on < '2007-01-01'

Instead of only setting the read-only property, one could also set an
alternative table space for the partition:

SET TABLE SPACE large_but_slow_storage
WHERE created_on < '2007-01-01';


SET TABLE SPACE large_but_slow_storage;

Please also note, that neither variant is limited to range partitioning.
You can theoretically partition by pretty much anything, for example
with a WHERE clause like:

..WHERE (id % 5) < 2

The primary difference I see between these two ways to declare
partitions is, that the former only modifies tuple properties
(read-only, storage location), while the later also tells the database
*why* it has to modify them.

That has several different effects. First, newly inserted tuples are
treated differently. For unnamed partitions, there must be defaults,
like read-writable and a default table space. With named partitions, you
define split points, so I guess one expects newly inserted tuples to end
up in the right partition automatically. Unnamed partitioning could be
equally automatic when letting a function decide, where to insert the
new tuple.

Second, repartitioning must be treated differently. With unnamed
partitioning, the admin must first adjust the defaults (if required) and
then move the existing tuple properties accordingly. With named
partitions, the admin only needs to adjust the split point and the
database system knows what it has to do.

And third, but IMO most importantly: to be able to optimize queries, the
database system has to know split points, so it can exclude partitions
or segments from scanning. Obviously, with named partitions, it always
knows them. Otherwise, you'll have to maintain some information about
the tuples in your partitions, as Simon does with the min/max tuples. As
soon as required, it could also maintain additional min/max values, i.e.
for (id % 5) for the above example.

I hope to have shown the most relevant aspects. To conclude, I'd say
that named partitioning is closer to manually managed partitioning, as
already known and often used. While unnamed partitioning is closer to
automated partitioning, where the DBA does *not need* to have names for
partitions, which is a pretty new and interesting idea to me.




Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Chernow 2008-01-08 20:07:36 Re: Proposal - libpq Type System beta-0.8a (was PGparam)
Previous Message Stefan Kaltenbrunner 2008-01-08 18:51:15 Re: 8.3.0 release schedule (Was:Re: [BUGS] BUG #3852: Could not create complex aggregate)