Re: Declarative partitioning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Ildar Musin <i(dot)musin(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Declarative partitioning
Date: 2016-05-20 08:37:23
Message-ID: 573ECCC3.7080104@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers


Hi Ildar,

On 2016/05/19 0:36, Ildar Musin wrote:
>
> Here is an experimental patch that optimizes planning time for range
> partitioned tables (it could be considered as a "proof of concept"). Patch
> should be applied on top of Amit's declarative partitioning patch. It
> handles only a very special case (often used though) where partitioning
> key consists of just a single attribute and doesn't contain expressions.

Great, thanks!

I understand that it's still PoC and the point may be just to consider
performance implications of excessive partdesc copying but I'm wondering
about a few things about the patch in general. See below.

> The main idea is the following:
> * we are looking for clauses like 'VAR OP CONST' (where VAR is
> partitioning key attribute, OP is a comparison operator);
> * using binary search find a partition (X) that fits CONST value;
> * based on OP operator determine which partitions are also covered by
> clause. There are possible cases:
> 1. If OP is '<' or '<=' then we need partitions standing left from X
> (including)
> 2. If OP is '>' or '>=' then we need partitions standing right from X
> (including)
> 3. If OP is '=' the we need only X partition
> (for '<' and '>' operators we also check if CONST value is equal to a
> lower or upper boundary (accordingly) and if it's true then exclude X).
>
> For boolean expressions we evaluate left and right sides accordingly to
> algorithm above and then based on boolean operator find intersection (for
> AND) or union (for OR).

Perhaps you're already aware but may I also suggest looking at how clauses
are matched to indexes? For example, consider how
match_clauses_to_index() in src/backend/optimizer/path/indxpath.c works.

Moreover, instead of pruning partitions in planner prep phase, might it
not be better to do that when considering paths for the (partitioned) rel?
IOW, instead of looking at parse->jointree, we should rather be working
with rel->baserestrictinfo. Although, that would require some revisions
to how append_rel_list, simple_rel_list, etc. are constructed and
manipulated in a given planner invocation. Maybe it's time for that...
Again, you may have already considered these things.

> I run some benchmarks on:
> 1. original constraint exclusion mechanism,
> 2. optimized version (this patch) and
> 3. optimized version using relation->rd_partdesc pointer instead of
> RelationGetPartitionDesc() function (see previous discussion).
>
> Initial conditions:
>
> CREATE TABLE abc (id SERIAL NOT NULL, a INT, dt TIMESTAMP) PARTITION BY
> RANGE (a);
> CREATE TABLE abc_1 PARTITION OF abc FOR VALUES START (0) END (1000);
> CREATE TABLE abc_2 PARTITION OF abc FOR VALUES START (1000) END (2000);
> ...
> etc
> INSERT INTO %s (a) SELECT generate_series(0, <partitions_count> * 1000);
>
> pgbench scripts:
> https://gist.github.com/zilder/872e634a8eeb405bd045465fc9527e53 (where
> :partitions is a number of partitions).
> The first script tests fetching a single row from the partitioned table.
> Results (tps):
>
> # of partitions | constraint excl. | optimized | optimized (using
> pointer)
> ----------------+------------------+---------------+----------------------------
>
> 100 | 658 | 2906 | 3079
> 1000 | 45 | 2174 | 3021
> 2000 | 22 | 1667 | 2919
>
>
> The second script tests fetching all data from a single partition. Results
> (tps):
>
> # of partitions | constraint excl. | optimized | optimized (using
> pointer)
> ----------------+------------------+---------------+----------------------------
>
> 100 | 317 | 1001 | 1051
> 1000 | 34 | 941 | 1023
> 2000 | 15 | 813 | 1016
>
> Optimized version works much faster on large amount of partitions and
> degradates slower than constraint exclusion. But still there is a
> noticeable performance degradation from copying PartitionDesc structure:
> with 2000 partitions RelationGetPartitionDesc() function spent more than
> 40% of all execution time on copying in first benchmark (measured with
> `perf`). Using reference counting as Amit suggests will allow to
> significantily decrease performance degradation.

Could you try with the attached updated set of patches? I changed
partition descriptor relcache code to eliminate excessive copying in
previous versions.

Thanks,
Amit

Attachment Content-Type Size
0001-Add-syntax-to-specify-partition-key-v5.patch text/x-diff 41.5 KB
0002-Add-a-IGNORE-dependency-type-v5.patch text/x-diff 3.0 KB
0003-Infrastructure-for-creation-of-partitioned-tables-v5.patch text/x-diff 86.9 KB
0004-Add-syntax-to-create-partitions-v5.patch text/x-diff 77.8 KB
0005-Infrastructure-for-partition-metadata-storage-and-ma-v5.patch text/x-diff 101.7 KB
0006-Introduce-tuple-routing-for-partitioned-tables-v5.patch text/x-diff 29.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2016-05-20 12:12:24 Re: It's seems that the function "do_text_output_multiline" does not suit for format "line1\nline2\n...lineN".
Previous Message Craig Ringer 2016-05-20 07:35:17 Re: foreign table batch inserts