Skip site navigation (1) Skip section navigation (2)

Re: Dynamic Partitioning using Segment Visibility Maps

From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
Subject: Re: Dynamic Partitioning using Segment Visibility Maps
Date: 2008-01-05 15:42:28
Message-ID: 477FA564.9010704@bluegap.ch (view raw or flat)
Thread:
Lists: pgsql-hackers
Hi,

Simon Riggs wrote:
 > On Fri, 2008-01-04 at 22:26 +0100, Markus Schiltknecht wrote:
 >
 >> I'm still puzzled about how a DBA is expected to figure out which
 >> segments to mark. Simon, are you assuming we are going to pass on
 >> segment numbers to the DBA one day?
 >
 > No Way!

Ah, I'm glad ;-)

Simon Riggs wrote:
> Skepticism is OK, but I'd like to get your detailed thoughts on this.
> I've been an advocate of the multi-tables approach now for many years,
> so I don't expect everybody to switch their beliefs on my say-so
> overnight. Let me make a few more comments in this area:

I've so far always thought about some sort of multi-relations approach 
for partitioning, yes. Let's see if I can get my mind around 
single-table partitioning.

> The main proposal deliberately has few, if any, knobs and dials. That's
> a point of philosophy that I've had views on previously: my normal
> stance is that we need some knobs to allow the database to be tuned to
> individual circumstances.
> 
> In this case, partitioning is way too complex to administer effectively
> and requires application changes that make it impossible to use for
> packaged applications. The latest Oracle TPC-H benchmark uses 10 pages
> of DDL to set it up and if I can find a way to avoid that, I'd recommend
> it to all. I do still want some knobs and dials, just not 10 pages
> worth, though I'd like yours and others' guidance on what those should
> be. Oracle have been responding to feedback with their new interval
> partitioning, but its still a multi-table approach in essence.

I can absolutely support your efforts to minimize knobs and 
configuration DDL. However, my current feeling is, that segments based 
partitioning complicates things, because the DBA doesn't have tools and 
commands to handle segments.

To satisfy all the different requirements of partitioning with segments 
based partitioning, we'd have to allow a table to span multiple table 
spaces. I'm not very keen on going that way.

However, what I certainly like is the automated split point definition. 
Instead of having to create tables by hand and "linking" them via 
inheritance and constraint exclusion, I have something very similar in 
mind, like what you proposed for marking read-only segments. Something like:

   SPLIT TABLE customers AT cust_name > 'n';

or:

   SPLIT TABLE inventory AT inv_id % 4 >= 2;

In my imagination, this should automatically create the underlying 
relations, i.e.:

   NOTICE: relations inventory__l and inventory__r have been created.

That way, the DBA could then handle those like normal relations, 
querying them or moving them to different table spaces like all other 
normal relations.

In a way, that's not so different from possible extensions on top of 
Segment Exclusion, except that the DBA additionally get a relation name 
to be able to address the set of segments which form a partition. Or put 
it the other way around: go for Segment Exclusion, but add some sort of 
a sentinel relation for each set of segments, to make them reachable for 
the DBA.

> My observation of partitioned databases is that they all work
> beautifully at the design stage, but problems emerge over time. A
> time-based range partitioned table can often have different numbers of
> rows per partition, giving inconsistent response times. A
> height-balanced approach where we make the partitions all the same size,
> yet vary the data value boundaries will give much more consistent query
> times and can be completely automated much more easily.

Uh.. well, consistent query time isn't the first thing I'm expecting 
from partitioning by time ranges. If I wanted consistent query times I'd 
rather use hash partition or something, no?

I'd even state, that one *wants* inconsistent response times when using 
time based range partitioning, by moving old, seldom used data to slower 
storage and keeping only a small amount of often used tuples on the 
faster disks, for example.

> The SVM concept doesn't cover everything that you can do with
> partitioning, but my feeling is it covers the main use cases well.

As I regard manageability to be the main advantage of partitioning, 
which you've intentionally left out for now, I disagree here.

How could SVM or Segment Exclusion potentially be covering what hash 
partitioning does? Maybe together with the ability to store different 
segments of a table on different table spaces. That could be considered 
an approach to range partitioning. But then, that would be the 
partitioning, and not SVM or Segment Exclusion. To me, both of SVM and 
SE look much more like an optimization for certain special cases and 
don't have much to do with partitioning.

Regards

Markus


In response to

Responses

pgsql-hackers by date

Next:From: Robert TreatDate: 2008-01-05 16:59:46
Subject: Re: Dynamic Partitioning using Segment Visibility Maps
Previous:From: Markus SchiltknechtDate: 2008-01-05 15:04:54
Subject: Re: Dynamic Partitioning using Segment Visibility Maps

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group