Quick Links

Re: Dynamic Partitioning using Segment Visibility Maps

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc:	pgsql-hackers(at)postgresql(dot)org, Markus Schiltknecht <markus(at)bluegap(dot)ch>, Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
Subject:	Re: Dynamic Partitioning using Segment Visibility Maps
Date:	2008-01-05 09:33:45
Message-ID:	1199525625.18598.229.camel@ebony.site
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, 2008-01-04 at 22:31 -0500, Robert Treat wrote:

> Not to be negative, but istm how this feature would be managed is as important
> as the bits under the hood.

Agreed. On this part of the thread, we've been discussing an extension
to the basic proposal, which is why I have not been concentrating there.

Core management wise, the basic proposal showed how we would be able to
have VACUUM run much faster than before and how DELETE will also be
optimised naturally by this approach. Loading isn't any slower than it
is now; loading does need some work, but that's another story.

> Or at least we have to believe there will be
> some practical way to manage this, which as of yet I am skeptical.

Skepticism is OK, but I'd like to get your detailed thoughts on this.
I've been an advocate of the multi-tables approach now for many years,
so I don't expect everybody to switch their beliefs on my say-so
overnight. Let me make a few more comments in this area:

The main proposal deliberately has few, if any, knobs and dials. That's
a point of philosophy that I've had views on previously: my normal
stance is that we need some knobs to allow the database to be tuned to
individual circumstances.

In this case, partitioning is way too complex to administer effectively
and requires application changes that make it impossible to use for
packaged applications. The latest Oracle TPC-H benchmark uses 10 pages
of DDL to set it up and if I can find a way to avoid that, I'd recommend
it to all. I do still want some knobs and dials, just not 10 pages
worth, though I'd like yours and others' guidance on what those should
be. Oracle have been responding to feedback with their new interval
partitioning, but its still a multi-table approach in essence.

My observation of partitioned databases is that they all work
beautifully at the design stage, but problems emerge over time. A
time-based range partitioned table can often have different numbers of
rows per partition, giving inconsistent response times. A
height-balanced approach where we make the partitions all the same size,
yet vary the data value boundaries will give much more consistent query
times and can be completely automated much more easily.

The SVM concept doesn't cover everything that you can do with
partitioning, but my feeling is it covers the main use cases well. If
that's not true, in broad strokes or in the detail, then we need to
uncover that. Everybody's help in doing that is appreciated, whatever
the viewpoint and whatever the outcome.

It's probably worth examining existing applications to see how well they
would migrate to segmented tables approach. The following query will
analyse one column of a table to produce a list of boundary values,
given a segment size of 131072 blocks (1 GB).

select
substr(ctid::text,2,strpos(ctid::text,',')-2)::integer/131072 as seg,
min(PrimaryKey), max(PrimaryKey)
from bigtable
group by seg;

We should be able to see whether this works for existing use cases, or
not fairly easily.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

In response to

Re: Dynamic Partitioning using Segment Visibility Maps at 2008-01-05 03:31:14 from Robert Treat

Responses

Re: Dynamic Partitioning using Segment Visibility Maps at 2008-01-05 12:45:47 from tomas
Re: Dynamic Partitioning using Segment Visibility Maps at 2008-01-05 15:42:28 from Markus Schiltknecht

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Markus Schiltknecht	2008-01-05 10:13:12	Re: Dynamic Partitioning using Segment Visibility Maps
Previous Message	Warren Turkal	2008-01-05 07:23:56	Re: timestamp typedefs