Re: Partitionement

From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Pierre BOIZOT <pierre(dot)boizot(at)gmail(dot)com>
Cc: PG-Mail-liste <pgsql-fr-generale(at)postgresql(dot)org>
Subject: Re: Partitionement
Date: 2013-10-06 21:00:20
Message-ID: 1381093220.5795.4.camel@localhost.localdomain
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-fr-generale

On Sun, 2013-10-06 at 17:37 +0200, Pierre BOIZOT wrote:
> Hello,
>
> J'ai fait le petit test suivant :
>
> Creation d'une table sales avec 4 partitions et deux critéres de
> partitionnement.
> un étant une date et l'autre un code.
>
> Il s'avere que l'écriture de la clause CHECK est importante ....
>
> CREATE TABLE sales_part1
> (CHECK (org = 6 and temps between to_date('20101001'::text,
> 'yyyymmdd'::text) and to_date('20131001'::text, 'yyyymmdd'::text)) )
> INHERITS (sales);
>
> transformé en :
> test=# \d+ sales_part1
> Table "public.sales_part1"
> Column | Type | Modifiers | Storage | Description
> --------+-----------------------------+-----------+----------+-------------
> org | integer | | plain |
> temps | timestamp without time zone | | plain |
> name | character varying(10) | | extended |
> Check constraints:
> "sales_part1_check" CHECK (org = 6 AND temps >=
> to_date('20101001'::text, 'yyyymmdd'::text) AND temps <=
> to_date('20131001'::text, 'yyyymmdd'::text))
> Inherits: sales
> Has OIDs: no
>
> ​fait que la requete suivante parcourt toutes les partitions.
>
> ​explain analyze select * from sales where temps >= date '20131101' and
> temps <= date '20141001';
> QUERY
> PLAN
> -------------------------------------------------------------------------------------------------------------------------
> Result (cost=0.00..103.73 rows=21 width=50) (actual time=0.082..0.296
> rows=5 loops=1)
> -> Append (cost=0.00..103.73 rows=21 width=50) (actual
> time=0.067..0.216 rows=5 loops=1)
> -> Seq Scan on sales (cost=0.00..2.53 rows=1 width=50) (actual
> time=0.019..0.019 rows=0 loops=1)
> Filter: ((temps >= '2013-11-01'::date) AND (temps <=
> '2014-10-01'::date))
> -> Seq Scan on sales_part1 sales (cost=0.00..25.30 rows=5
> width=50) (actual time=0.011..0.011 rows=0 loops=1)
> Filter: ((temps >= '2013-11-01'::date) AND (temps <=
> '2014-10-01'::date))
> -> Seq Scan on sales_part2 sales (cost=0.00..25.30 rows=5
> width=50) (actual time=0.009..0.023 rows=2 loops=1)
> Filter: ((temps >= '2013-11-01'::date) AND (temps <=
> '2014-10-01'::date))
> -> Seq Scan on sales_part3 sales (cost=0.00..25.30 rows=5
> width=50) (actual time=0.024..0.024 rows=0 loops=1)
> Filter: ((temps >= '2013-11-01'::date) AND (temps <=
> '2014-10-01'::date))
> -> Seq Scan on sales_part4 sales (cost=0.00..25.30 rows=5
> width=50) (actual time=0.013..0.034 rows=3 loops=1)
> Filter: ((temps >= '2013-11-01'::date) AND (temps <=
> '2014-10-01'::date))
> Total runtime: 0.410 ms​
>
> ​alors que si la contrainte est écrite comme ci-dessous :
> ​CREATE TABLE ventes_part1
> (CHECK ( (temps >= date '20101001' and temps <= date '20131001')
> and org=6 ) )
> INHERITS (ventes);​
>
>
> ​transformé en :
> d+
> ventes_part1
>
> Table "public.ventes_part1"
> Column | Type | Modifiers | Storage | Description
> --------+-----------------------------+-----------+----------+-------------
> org | integer | | plain |
> temps | timestamp without time zone | | plain |
> name | character varying(10) | | extended |
> Check constraints:
> "ventes_part1_check" CHECK (temps >= '2010-10-01'::date AND temps <=
> '2013-10-01'::date AND org = 6)
> Inherits: ventes
> Has OIDs: no​
>
> ​la meme requete ne parcourera que deux partitions :-)​
>
> ​test=# explain analyse select * from ventes where temps >= date
> '20131101' and temps <= date '20141001';
> QUERY
> PLAN
> ---------------------------------------------------------------------------------------------------------------------------
> Result (cost=0.00..53.13 rows=11 width=50) (actual time=0.056..0.169
> rows=5 loops=1)
> -> Append (cost=0.00..53.13 rows=11 width=50) (actual
> time=0.046..0.120 rows=5 loops=1)
> -> Seq Scan on ventes (cost=0.00..2.53 rows=1 width=50) (actual
> time=0.023..0.023 rows=0 loops=1)
> Filter: ((temps >= '2013-11-01'::date) AND (temps <=
> '2014-10-01'::date))
> -> Seq Scan on ventes_part2 ventes (cost=0.00..25.30 rows=5
> width=50) (actual time=0.011..0.021 rows=2 loops=1)
> Filter: ((temps >= '2013-11-01'::date) AND (temps <=
> '2014-10-01'::date))
> -> Seq Scan on ventes_part4 ventes (cost=0.00..25.30 rows=5
> width=50) (actual time=0.009..0.022 rows=3 loops=1)
> Filter: ((temps >= '2013-11-01'::date) AND (temps <=
> '2014-10-01'::date))
> Total runtime: 0.235 ms
> (9 rows)
> ​
> si on ajoute un critere de filtre supplémentaire à la clause where .
> on retrouve le meme comportement sur le parcours des partitions.
>
> Parcours d'une partition
> test=# explain analyse select * from ventes where temps >= date '20131101'
> and temps <= date '20141001' and org=7;
> QUERY
> PLAN
> ---------------------------------------------------------------------------------------------------------------------------
> Result (cost=0.00..30.64 rows=2 width=50) (actual time=0.028..0.069
> rows=3 loops=1)
> -> Append (cost=0.00..30.64 rows=2 width=50) (actual time=0.022..0.048
> rows=3 loops=1)
> -> Seq Scan on ventes (cost=0.00..2.79 rows=1 width=50) (actual
> time=0.009..0.009 rows=0 loops=1)
> Filter: ((temps >= '2013-11-01'::date) AND (temps <=
> '2014-10-01'::date) AND (org = 7))
> -> Seq Scan on ventes_part4 ventes (cost=0.00..27.85 rows=1
> width=50) (actual time=0.005..0.014 rows=3 loops=1)
> Filter: ((temps >= '2013-11-01'::date) AND (temps <=
> '2014-10-01'::date) AND (org = 7))
> Total runtime: 0.108 ms
> (7 rows)
>
> Parcours de deux partitions.
> test=# explain analyse select * from sales where temps >= date '20131101'
> and temps <= date '20141001' and org=7;
> QUERY
> PLAN
> -------------------------------------------------------------------------------------------------------------------------
> Result (cost=0.00..58.48 rows=3 width=50) (actual time=0.049..0.103
> rows=3 loops=1)
> -> Append (cost=0.00..58.48 rows=3 width=50) (actual time=0.041..0.074
> rows=3 loops=1)
> -> Seq Scan on sales (cost=0.00..2.79 rows=1 width=50) (actual
> time=0.011..0.011 rows=0 loops=1)
> Filter: ((temps >= '2013-11-01'::date) AND (temps <=
> '2014-10-01'::date) AND (org = 7))
> -> Seq Scan on sales_part3 sales (cost=0.00..27.85 rows=1
> width=50) (actual time=0.007..0.007 rows=0 loops=1)
> Filter: ((temps >= '2013-11-01'::date) AND (temps <=
> '2014-10-01'::date) AND (org = 7))
> -> Seq Scan on sales_part4 sales (cost=0.00..27.85 rows=1
> width=50) (actual time=0.007..0.020 rows=3 loops=1)
> Filter: ((temps >= '2013-11-01'::date) AND (temps <=
> '2014-10-01'::date) AND (org = 7))
> Total runtime: 0.177 ms
>
> En conclusion ....
> Cela marche mais il est impératif d'éxaminer soigneusement la requete la
> clause CHECK des tables.
>
> je mesure la sensibilité :-(
>

Oui, c'est clair que c'est sensible.

> Merci .
>
> *Question subsidiaire* : il me semble avoir lu dans la documentation que
> l'on ne devrait pas aller au delà de 100 partitions.
>
> Est-ce encore d'actualité ?
>

Tu peux aller au-dessus, il n'y a rien en dur qui l'en empêche. Vu la
sensibilité du système, c'est pas forcément une bonne idée :)

--
Guillaume
http://blog.guillaume.lelarge.info
http://www.dalibo.com

In response to

Browse pgsql-fr-generale by date

  From Date Subject
Next Message Dimitri Fontaine 2013-10-06 21:56:02 Re: Partitionement
Previous Message Pierre BOIZOT 2013-10-06 15:50:33 Re: psql : extension des commandes ...