Re: Dynamic Partitioning using Segment Visibility Maps

From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org, Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
Subject: Re: Dynamic Partitioning using Segment Visibility Maps
Date: 2008-01-05 19:02:41
Message-ID: 477FD451.1010406@bluegap.ch
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Robert Treat wrote:
> Personally I cant say it complicates things, because it isn't clear how it
> will be managed. :-)

Well, management of relations is easy enough, known to the DBA and most
importantly: it already exists. Having to set up something which is
*not* tied to a relation complicates things just because it's an
additional concept.

But as I've pointed out, maybe what we have in mind isn't that different
at all. Just have a sentinel relation mean a set of segments, i.e. all
read-only segments of a table. Then again, a table - in a way - is not
much else than a set of segments. So where's the real difference?

>> To satisfy all the different requirements of partitioning with segments
>> based partitioning, we'd have to allow a table to span multiple table
>> spaces. I'm not very keen on going that way.
>>
>
> Why?

Uh.. if a table (RELKIND_RELATION) can only span one table space, as it
is now, all of its segments are in the same table space. I don't quite
call that partitioning. Well, sure, you could call it so, but then, each
and every Postgres table is already partitioned in 1G segments.

It all depends on the definitions, but in my world, horizontal
partitioning for databases involves multiple table spaces (and is quite
useless without that). Calling anything else partitioning is confusing, IMO.

> So the one thing that always scares me about these "define it all and let the
> database sort it out" methods is they seem to lead to cases where the system
> ends up rewriting the data to fit into some new partition layout.

That holds true no matter if you shuffle between segments or relations.
To be able to let the DBA define an exact split point, the database
*will* have to shuffle tuples around. Why does that scare you? It's a
regular database system's maintenance procedure.

> One thing
> that is nice about the current partitioning scheme is you can control the
> impact of this behavior in these scenarios, but moving around small portions
> of the table at a time.

Uh.. I'm not quite following. What "current partitioning scheme" are you
referring to?

Why should that not be possible with other schemes? Moving the split
point between two partitions involves moving tuples around, no matter if
you are going to move them between segments or between relations
building the partitions.

> More to the point (I think) is that people define access to the data based on
> the meaning of the data, not how it is stored on disk. For example, in some
> tables we only need to be active on 1 months worth of data... how that is
> laid out on disk (# partitions, which tablespaces) is a means to the end of
> working actively on 1 months worth of data. I can't think of many cases where
> people would actually say the want to work actively on the most recent GB of
> data.

Agreed. I'd say that's why the DBA needs to be able to define the split
point between partitions: only he knows the meaning of the data.

>> To me, both of SVM and
>> SE look much more like an optimization for certain special cases and
>> don't have much to do with partitioning.
>
> Even if this were true, it might still be a useful optimization.

Possibly, yes. To me, the use case seems pretty narrow, though. For
example it doesn't affect index scans much.

> One table I
> am thinking of in particular in my system has one query we need to run across
> partitions, which ends up doing a slew of bitmap index scans for all the
> partitions. If using segment exclusion on it meant that I could get a global
> index to help that query, I'd be happy.

As proposed, Segment Exclusion works only on exactly one table. Thus, if
you already have your data partitioned into multiple relations, it most
probably won't affect your setup much. It certainly has nothing to do
with what I understand by 'global index' (that's an index spanning
multiple tables, right?).

Regards

Markus

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Mielke 2008-01-05 19:14:53 Re: SSL over Unix-domain sockets
Previous Message Tom Lane 2008-01-05 17:39:08 Re: SSL over Unix-domain sockets