Re: Dynamic Partitioning using Segment Visibility Maps

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Markus Schiltknecht <markus(at)bluegap(dot)ch>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Dynamic Partitioning using Segment Visibility Maps
Date: 2008-01-04 13:40:03
Message-ID: 1199454003.18598.87.camel@ebony.site
Lists: pgsql-hackers

On Fri, 2008-01-04 at 13:29 +0100, Markus Schiltknecht wrote:

> Given that we are operating on segments here, to which the DBA has very
> limited information and access, I prefer the term "Segment Exclusion". I
> think of that as an optimization of sequential scans on tables with the
> above mentioned characteristics.
>
> > If we do need to differentiate between the two proposals, we can refer
> > to this one as the Segment Visibility Map (SVM).
>
> I'm clearly in favor of separating between the two proposals. SVM is a
> good name, IMHO.

OK, I'll refer to this proposal as SVM.

> > There would be additional complexity in selectivity estimation and plan
> > costing. The above proposal allows dynamic segment exclusion, which
> > cannot be assessed at planning time anyway, so suggestions welcome...
>
> Hm.. that looks like a rather bad downside of an executor-only optimization.

I think that's generally true. We already have that problem with planned
statements and work_mem, for example, and parameterised query planning
is a difficult problem. Stable functions are already estimated at plan
time, so we hopefully should be getting that right. I don't see any show
stoppers here, just more of the usual problems of query optimization.
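
To make the plan-time versus run-time distinction concrete, here is a
minimal sketch, not the proposed implementation. It assumes each
segment carries a min/max summary for one column; the names Segment and
segments_to_scan are hypothetical and chosen only for illustration. The
point is that the set of segments to scan depends on a value that is
only known at execution time, so the planner cannot cost it exactly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Hypothetical per-segment summary: value range of one column."""
    seg_no: int
    min_val: int
    max_val: int


def segments_to_scan(segments: List[Segment], lo: int, hi: int) -> List[int]:
    """Return the numbers of segments whose value range overlaps [lo, hi].

    Segments whose range cannot contain a matching row are excluded.
    """
    return [s.seg_no for s in segments if s.max_val >= lo and s.min_val <= hi]


# Assumed layout: three segments holding roughly ordered values.
segments = [Segment(0, 1, 100), Segment(1, 101, 200), Segment(2, 201, 300)]

# The parameter value arrives at execution time, e.g. from a prepared
# statement, so exclusion can only be decided then.
param = 150
print(segments_to_scan(segments, param, param))
```

With a different parameter value the same plan would touch a different
set of segments, which is why the costing can only ever be an estimate.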

> > Comparison with other Partitioning approaches
> > ---------------------------------------------
> >
> > Once I realised this was possible in a fairly automatic way, I've tried
> > hard to keep away from manual overrides, commands and new DDL.
> >
> > Declarative partitioning is a big overhead, though worth it for large
> > databases. No overhead is *much* better though.
> >
> > This approach to partitioning solves the following challenges
> > - allows automated table extension, so works automatically with Slony
> > - responds dynamically to changing data
> > - allows stable functions, nested loop joins and parametrised queries
> > - allows RI via SHARE locks
> > - avoids the need for operator push-down through Append nodes
> > - allows unique indexes
> > - allows global indexes (because we only have one table)
> > - allows advanced planning/execution using read-only/visible data
> > - works naturally with synchronous scans and buffer recycling
> >
> > All of the above are going to take considerably longer to do in any of
> > the other ways I've come up with so far...
>
> I fully agree. But as I tried to point out above, the gains in
> manageability from Segment Exclusion are also pretty close to zero. So
> I'd argue they only fulfill parts of the needs for general horizontal
> partitioning.

Agreed.

My focus for this proposal wasn't manageability, as it had been in other
recent proposals. I think there are some manageability wins to be had as
well, but we need to decide what sort of partitioning we want/need
first.

So in the case of SVM, enhanced manageability is really a phase 2 thing.

Plus, you can always combine constraint exclusion and segment
exclusion in a single design.

> Maybe a combination with CLUSTERing would be worthwhile? Or even
> enforced CLUSTERing for the older segments?

I think there's merit in Heikki's maintain cluster order patch and that
should do an even better job of maintaining locality.

Thanks for the detailed comments. I'll do my best to include all of the
viewpoints you've expressed as the design progresses.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
