Re: Query performance problems with partitioned tables

From: Andreas Haumer <andreas(at)xss(dot)co(dot)at>
To: Guillaume Cottenceau <gc(at)mnc(dot)ch>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Query performance problems with partitioned tables
Date: 2007-04-30 13:29:30
Message-ID: 4635EF3A.1070204@xss.co.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi!

Guillaume Cottenceau schrieb:
> Andreas Haumer <andreas 'at' xss.co.at> writes:
[...]
>
>> Now my question is: Does the query planner in the case of partitioned tables
>> really have to scan all indexes in order to get the next timestamp smaller
>> (or larger) than a given one?
>
> Well, how can the planner know inside which partition the wanted
> row is? There might be no data, say, inside a couple of
> partitions in the past before finding the wanted row, in which
> case 3 partitions in the past must be scanned.
>

I think the planner could do the following:

a) It could make a better decision in which direction to scan
the partitions (depending on sort order involved in the query)

b) It could stop scanning as soon as there can not be any further
resulting row according to the CHECK constraints given on the tables.

Currently it doesn't do this.

Look at this example:

testdb_part=> select ts from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00' order by ts asc limit 1;
ts
- ------------------------
2006-01-01 01:00:00+01
(1 row)

testdb_part=> explain analyze select ts from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00' order by ts asc limit 1;
QUERY PLAN
- --------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=15843.41..15843.41 rows=1 width=8) (actual time=152.476..152.478 rows=1 loops=1)
-> Sort (cost=15843.41..15865.39 rows=8795 width=8) (actual time=152.472..152.472 rows=1 loops=1)
Sort Key: mwdb.t_mv.ts
-> Result (cost=0.00..15267.23 rows=8795 width=8) (actual time=0.102..122.540 rows=11629 loops=1)
-> Append (cost=0.00..15267.23 rows=8795 width=8) (actual time=0.098..76.140 rows=11629 loops=1)
-> Index Scan using pk_mv_zr_ts on t_mv (cost=0.00..8.27 rows=1 width=8) (actual time=0.022..0.022 rows=0 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200601 on t_mv_200601 t_mv (cost=0.00..986.73 rows=582 width=8) (actual time=0.070..2.136 rows=743 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200602 on t_mv_200602 t_mv (cost=0.00..847.40 rows=497 width=8) (actual time=0.066..2.063 rows=672 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200603 on t_mv_200603 t_mv (cost=0.00..961.33 rows=565 width=8) (actual time=0.063..2.115 rows=743 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200604 on t_mv_200604 t_mv (cost=0.00..901.09 rows=528 width=8) (actual time=0.156..2.200 rows=720 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200605 on t_mv_200605 t_mv (cost=0.00..945.38 rows=555 width=8) (actual time=0.052..2.088 rows=744 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200606 on t_mv_200606 t_mv (cost=0.00..995.58 rows=587 width=8) (actual time=0.054..1.869 rows=720 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200607 on t_mv_200607 t_mv (cost=0.00..983.15 rows=578 width=8) (actual time=0.045..1.989 rows=744 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200608 on t_mv_200608 t_mv (cost=0.00..976.05 rows=573 width=8) (actual time=0.048..1.877 rows=744 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200609 on t_mv_200609 t_mv (cost=0.00..902.86 rows=529 width=8) (actual time=0.054..2.225 rows=720 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200610 on t_mv_200610 t_mv (cost=0.00..934.74 rows=548 width=8) (actual time=0.034..2.671 rows=745 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200611 on t_mv_200611 t_mv (cost=0.00..913.50 rows=536 width=8) (actual time=0.053..2.302 rows=720 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200612 on t_mv_200612 t_mv (cost=0.00..983.15 rows=578 width=8) (actual time=0.059..2.449 rows=744 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200701 on t_mv_200701 t_mv (cost=0.00..929.43 rows=545 width=8) (actual time=0.034..2.035 rows=744 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200702 on t_mv_200702 t_mv (cost=0.00..863.33 rows=506 width=8) (actual time=0.034..1.675 rows=672 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200703 on t_mv_200703 t_mv (cost=0.00..925.87 rows=542 width=8) (actual time=0.055..2.036 rows=743 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200704 on t_mv_200704 t_mv (cost=0.00..1209.39 rows=545 width=8) (actual time=0.061..2.296 rows=711 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
Total runtime: 153.195 ms
(40 rows)

Table t_mv_200601 gets scanned first, which is fine.

This already gives a row matching the given WHERE clause.
It makes no sense to scan the other tables, as the query
asks for one row only and all the other tables have timestamps
larger than all the timestamps in table t_mv_200601 (according
to the CHECK constraints for the partion tables)

The same would be true with the following query using an aggregate function
(perhaps this is a better example for my reasoning):

testdb_part=> select min(ts) from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00';
min
- ------------------------
2006-01-01 01:00:00+01
(1 row)

testdb_part=> explain analyze select min(ts) from mwdb.t_mv where zr=3622 and ts > '2006-01-01 00:00:00';
QUERY PLAN
- --------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=15289.22..15289.23 rows=1 width=8) (actual time=106.735..106.737 rows=1 loops=1)
-> Append (cost=0.00..15267.23 rows=8795 width=8) (actual time=0.184..78.174 rows=11629 loops=1)
-> Index Scan using pk_mv_zr_ts on t_mv (cost=0.00..8.27 rows=1 width=8) (actual time=0.035..0.035 rows=0 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200601 on t_mv_200601 t_mv (cost=0.00..986.73 rows=582 width=8) (actual time=0.143..2.207 rows=743 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200602 on t_mv_200602 t_mv (cost=0.00..847.40 rows=497 width=8) (actual time=0.020..1.709 rows=672 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200603 on t_mv_200603 t_mv (cost=0.00..961.33 rows=565 width=8) (actual time=0.033..2.076 rows=743 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200604 on t_mv_200604 t_mv (cost=0.00..901.09 rows=528 width=8) (actual time=0.027..2.039 rows=720 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200605 on t_mv_200605 t_mv (cost=0.00..945.38 rows=555 width=8) (actual time=0.031..2.109 rows=744 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200606 on t_mv_200606 t_mv (cost=0.00..995.58 rows=587 width=8) (actual time=0.023..2.001 rows=720 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200607 on t_mv_200607 t_mv (cost=0.00..983.15 rows=578 width=8) (actual time=0.027..2.064 rows=744 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200608 on t_mv_200608 t_mv (cost=0.00..976.05 rows=573 width=8) (actual time=0.030..1.932 rows=744 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200609 on t_mv_200609 t_mv (cost=0.00..902.86 rows=529 width=8) (actual time=0.021..2.408 rows=720 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200610 on t_mv_200610 t_mv (cost=0.00..934.74 rows=548 width=8) (actual time=0.014..2.046 rows=745 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200611 on t_mv_200611 t_mv (cost=0.00..913.50 rows=536 width=8) (actual time=0.024..1.846 rows=720 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200612 on t_mv_200612 t_mv (cost=0.00..983.15 rows=578 width=8) (actual time=0.019..2.556 rows=744 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200701 on t_mv_200701 t_mv (cost=0.00..929.43 rows=545 width=8) (actual time=0.022..2.188 rows=744 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200702 on t_mv_200702 t_mv (cost=0.00..863.33 rows=506 width=8) (actual time=0.023..2.311 rows=672 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200703 on t_mv_200703 t_mv (cost=0.00..925.87 rows=542 width=8) (actual time=0.027..1.977 rows=743 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
-> Index Scan using pk_mv_200704 on t_mv_200704 t_mv (cost=0.00..1209.39 rows=545 width=8) (actual time=0.022..2.084 rows=711 loops=1)
Index Cond: (((zr)::integer = 3622) AND ((ts)::timestamp with time zone > '2006-01-01 00:00:00+01'::timestamp with time zone))
Total runtime: 107.152 ms
(37 rows)

As soon as the query found a "min(ts)" which is larger than
timestamp "2006-01-01 00:00:00" (the WHERE clause) in table
t_mv_200601, it can stop scanning, as there *can not* be any
timestamp smaller than the one already found in the other
tables (again, according to the CHECK constraints)!

Perhaps the logic to implement this is complex, but IMHO
it _should_ be doable (and proofable), shouldn't it?

In fact, the query planner already does partly select the tables
to scan in an intelligent way, because it does not scan the tables
with timestamps smaller than "2006-01-01 00:00:00", but IMHO it
still scans too much tables.

Comments?

- - andreas

- --
Andreas Haumer | mailto:andreas(at)xss(dot)co(dot)at
*x Software + Systeme | http://www.xss.co.at/
Karmarschgasse 51/2/20 | Tel: +43-1-6060114-0
A-1100 Vienna, Austria | Fax: +43-1-6060114-71
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGNe83xJmyeGcXPhERAk2fAJ98aqfKl7pQtac4HvSRr9GYbktadgCfU76J
ZmMj1A3UFejvS+2JrstrTaA=
=Myeo
-----END PGP SIGNATURE-----

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Huxton 2007-04-30 13:45:05 Re: Query performance problems with partitioned tables
Previous Message Guillaume Cottenceau 2007-04-30 13:05:06 Re: Query performance problems with partitioned tables