Re: sequential scan on child partition tables

From: Anj Adu <fotographs(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)postgresql(dot)org
Subject: Re: sequential scan on child partition tables
Date: 2009-10-15 20:51:45
Message-ID: f2fd819a0910151351t463ca20cg578862803105d7a7@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

This appears to be a bug in the optimizer with resepct to planning
queries involving child partitions. It is clear that "any" index is
being ignored even if the selectivity is high. I had to re-write the
same query by explicitly "union-all" ' ing the queries for individual
partitions.

On Wed, Oct 14, 2009 at 11:02 PM, Anj Adu <fotographs(at)gmail(dot)com> wrote:
> That..however is not how the data is distributed...the query is doing
> a sequential scan on "every" partition that is within the date
> constraint specified...i.e 2009-10-07 thru 2009-10-13......there is no
> data from 2009-10-14 onwards. The constraints when applied account for
> less than 25% of the data.
>
> When I replace the query with a "union all" of all specific
> partitions..the query runs very quickly. Below is the explain plan for
> the "union-all" version of the query.
>
>
> HashAggregate  (cost=59285.14..59285.93 rows=63 width=56) (actual
> time=276141.218..276141.378 rows=185 loops=1)
>   ->  Append  (cost=8496.41..59283.73 rows=63 width=32) (actual
> time=1012.844..276140.866 rows=185 loops=1)
>         ->  Subquery Scan "*SELECT* 1"  (cost=8496.41..8496.61 rows=9
> width=32) (actual time=1012.843..1012.910 rows=28 loops=1)
>               ->  HashAggregate  (cost=8496.41..8496.52 rows=9
> width=32) (actual time=1012.839..1012.865 rows=28 loops=1)
>                     ->  Hash Join  (cost=1250.35..8496.29 rows=9
> width=32) (actual time=97.599..990.893 rows=10316 loops=1)
>                           Hash Cond: ("outer".sentryid_id =
> "inner".sentryid_id)
>                           ->  Nested Loop  (cost=1234.84..8470.84
> rows=1971 width=32) (actual time=97.492..975.741 rows=17602 loops=1)
>                                 ->  Unique  (cost=1218.94..1219.05
> rows=1 width=8) (actual time=44.352..44.653 rows=29 loops=1)
>                                       ->  Sort
> (cost=1218.94..1219.00 rows=21 width=8) (actual time=44.351..44.469
> rows=307 loops=1)
>                                             Sort Key: ssa_tr_dy.source_ip_num
>                                             ->  Bitmap Heap Scan on
> ssa_tr_dy  (cost=98.52..1218.48 rows=21 width=8) (actual
> time=33.170..43.848 rows=307 loops=1)
>                                                   Recheck Cond:
> (firstoccurrence > (now() - '8 days'::interval))
>                                                   Filter:
> ((sentryid)::text ~~ 'edmc%'::text)
>                                                   ->  Bitmap Index
> Scan on ssa_dn_tr_idx1  (cost=0.00..98.52 rows=4148 width=0) (actual
> time=16.999..16.999 rows=1230 loops=1)
>                                                         Index Cond:
> (firstoccurrence > (now() - '8 days'::interval))
>                                 ->  Bitmap Heap Scan on
> pix_tr_dy_dimension_2009_10_07 a  (cost=15.90..7227.15 rows=1971
> width=32) (actual time=26.373..31.319 rows=607 loops=29)
>                                       Recheck Cond: (a.source_ip_num
> = "outer".source_ip_num)
>                                       ->  Bitmap Index Scan on
> pix_tr_dy_dimension_2009_10_07_source_ip_num  (cost=0.00..15.90
> rows=1971 width=0) (actual time=26.179..26.179 rows=607 loops=29)
>                                             Index Cond:
> (a.source_ip_num = "outer".source_ip_num)
>                           ->  Hash  (cost=15.50..15.50 rows=2
> width=4) (actual time=0.087..0.087 rows=3 loops=1)
>                                 ->  Seq Scan on pix_tr_sentryid sent
> (cost=0.00..15.50 rows=2 width=4) (actual time=0.061..0.067 rows=3
> loops=1)
>                                       Filter: ((sentryid)::text ~~
> 'edmc%'::text)
>         ->  Subquery Scan "*SELECT* 2"  (cost=9219.21..9219.44
> rows=10 width=32) (actual time=54867.609..54867.673 rows=30 loops=1)
>               ->  HashAggregate  (cost=9219.21..9219.34 rows=10
> width=32) (actual time=54867.605..54867.636 rows=30 loops=1)
>                     ->  Hash Join  (cost=1251.08..9219.09 rows=10
> width=32) (actual time=36.722..54826.062 rows=12975 loops=1)
>                           Hash Cond: ("outer".sentryid_id =
> "inner".sentryid_id)
>                           ->  Nested Loop  (cost=1235.58..9192.58
> rows=2180 width=32) (actual time=36.661..54800.027 rows=19624 loops=1)
>                                 ->  Unique  (cost=1218.94..1219.05
> rows=1 width=8) (actual time=1.807..2.153 rows=29 loops=1)
>                                       ->  Sort
> (cost=1218.94..1219.00 rows=21 width=8) (actual time=1.805..1.939
> rows=307 loops=1)
>                                             Sort Key: ssa_tr_dy.source_ip_num
>                                             ->  Bitmap Heap Scan on
> ssa_tr_dy  (cost=98.52..1218.48 rows=21 width=8) (actual
> time=0.739..1.478 rows=307 loops=1)
>                                                   Recheck Cond:
> (firstoccurrence > (now() - '8 days'::interval))
>                                                   Filter:
> ((sentryid)::text ~~ 'edmc%'::text)
>                                                   ->  Bitmap Index
> Scan on ssa_dn_tr_idx1  (cost=0.00..98.52 rows=4148 width=0) (actual
> time=0.709..0.709 rows=1230 loops=1)
>                                                         Index Cond:
> (firstoccurrence > (now() - '8 days'::interval))
>                                 ->  Bitmap Heap Scan on
> pix_tr_dy_dimension_2009_10_08 a  (cost=16.63..7946.27 rows=2180
> width=32) (actual time=28.162..1888.539 rows=677 loops=29)
>                                       Recheck Cond: (a.source_ip_num
> = "outer".source_ip_num)
>                                       ->  Bitmap Index Scan on
> pix_tr_dy_dimension_2009_10_08_source_ip_num  (cost=0.00..16.63
> rows=2180 width=0) (actual time=24.044..24.044 rows=677 loops=29)
>                                             Index Cond:
> (a.source_ip_num = "outer".source_ip_num)
>                           ->  Hash  (cost=15.50..15.50 rows=2
> width=4) (actual time=0.040..0.040 rows=3 loops=1)
>                                 ->  Seq Scan on pix_tr_sentryid sent
> (cost=0.00..15.50 rows=2 width=4) (actual time=0.029..0.036 rows=3
> loops=1)
>                                       Filter: ((sentryid)::text ~~
> 'edmc%'::text)
>         ->  Subquery Scan "*SELECT* 3"  (cost=8814.11..8814.31 rows=9
> width=32) (actual time=51634.494..51634.547 rows=24 loops=1)
>               ->  HashAggregate  (cost=8814.11..8814.22 rows=9
> width=32) (actual time=51634.490..51634.519 rows=24 loops=1)
>                     ->  Hash Join  (cost=1250.68..8814.00 rows=9
> width=32) (actual time=65.658..51599.190 rows=10962 loops=1)
>                           Hash Cond: ("outer".sentryid_id =
> "inner".sentryid_id)
>                           ->  Nested Loop  (cost=1235.18..8788.07
> rows=2067 width=32) (actual time=65.596..51574.491 rows=20261 loops=1)
>                                 ->  Unique  (cost=1218.94..1219.05
> rows=1 width=8) (actual time=1.781..2.104 rows=29 loops=1)
>                                       ->  Sort
> (cost=1218.94..1219.00 rows=21 width=8) (actual time=1.779..1.896
> rows=307 loops=1)
>                                             Sort Key: ssa_tr_dy.source_ip_num
>                                             ->  Bitmap Heap Scan on
> ssa_tr_dy  (cost=98.52..1218.48 rows=21 width=8) (actual
> time=0.730..1.455 rows=307 loops=1)
>                                                   Recheck Cond:
> (firstoccurrence > (now() - '8 days'::interval))
>                                                   Filter:
> ((sentryid)::text ~~ 'edmc%'::text)
>                                                   ->  Bitmap Index
> Scan on ssa_dn_tr_idx1  (cost=0.00..98.52 rows=4148 width=0) (actual
> time=0.699..0.699 rows=1230 loops=1)
>                                                         Index Cond:
> (firstoccurrence > (now() - '8 days'::interval))
>                                 ->  Bitmap Heap Scan on
> pix_tr_dy_dimension_2009_10_09 a  (cost=16.23..7543.17 rows=2067
> width=32) (actual time=27.327..1777.293 rows=699 loops=29)
>                                       Recheck Cond: (a.source_ip_num
> = "outer".source_ip_num)
>                                       ->  Bitmap Index Scan on
> pix_tr_dy_dimension_2009_10_09_source_ip_num  (cost=0.00..16.23
> rows=2067 width=0) (actual time=23.784..23.784 rows=699 loops=29)
>                                             Index Cond:
> (a.source_ip_num = "outer".source_ip_num)
>                           ->  Hash  (cost=15.50..15.50 rows=2
> width=4) (actual time=0.040..0.040 rows=3 loops=1)
>                                 ->  Seq Scan on pix_tr_sentryid sent
> (cost=0.00..15.50 rows=2 width=4) (actual time=0.029..0.036 rows=3
> loops=1)
>                                       Filter: ((sentryid)::text ~~
> 'edmc%'::text)
>         ->  Subquery Scan "*SELECT* 4"  (cost=9686.21..9686.45
> rows=11 width=32) (actual time=33707.854..33707.900 rows=24 loops=1)
>               ->  HashAggregate  (cost=9686.21..9686.34 rows=11
> width=32) (actual time=33707.851..33707.874 rows=24 loops=1)
>                     ->  Hash Join  (cost=1252.67..9686.07 rows=11
> width=32) (actual time=37.055..33679.711 rows=7580 loops=1)
>                           Hash Cond: ("outer".sentryid_id =
> "inner".sentryid_id)
>                           ->  Nested Loop  (cost=1237.17..9658.71
> rows=2349 width=32) (actual time=37.001..33662.042 rows=11414 loops=1)
>                                 ->  Unique  (cost=1218.94..1219.05
> rows=1 width=8) (actual time=1.903..2.273 rows=29 loops=1)
>                                       ->  Sort
> (cost=1218.94..1219.00 rows=21 width=8) (actual time=1.901..2.045
> rows=307 loops=1)
>                                             Sort Key: ssa_tr_dy.source_ip_num
>                                             ->  Bitmap Heap Scan on
> ssa_tr_dy  (cost=98.52..1218.48 rows=21 width=8) (actual
> time=0.788..1.570 rows=307 loops=1)
>                                                   Recheck Cond:
> (firstoccurrence > (now() - '8 days'::interval))
>                                                   Filter:
> ((sentryid)::text ~~ 'edmc%'::text)
>                                                   ->  Bitmap Index
> Scan on ssa_dn_tr_idx1  (cost=0.00..98.52 rows=4148 width=0) (actual
> time=0.755..0.755 rows=1230 loops=1)
>                                                         Index Cond:
> (firstoccurrence > (now() - '8 days'::interval))
>                                 ->  Bitmap Heap Scan on
> pix_tr_dy_dimension_2009_10_10 a  (cost=18.22..8410.29 rows=2349
> width=32) (actual time=14.229..1160.004 rows=394 loops=29)
>                                       Recheck Cond: (a.source_ip_num
> = "outer".source_ip_num)
>                                       ->  Bitmap Index Scan on
> pix_tr_dy_dimension_2009_10_10_source_ip_num  (cost=0.00..18.22
> rows=2349 width=0) (actual time=10.351..10.351 rows=394 loops=29)
>                                             Index Cond:
> (a.source_ip_num = "outer".source_ip_num)
>                           ->  Hash  (cost=15.50..15.50 rows=2
> width=4) (actual time=0.037..0.037 rows=3 loops=1)
>                                 ->  Seq Scan on pix_tr_sentryid sent
> (cost=0.00..15.50 rows=2 width=4) (actual time=0.028..0.033 rows=3
> loops=1)
>                                       Filter: ((sentryid)::text ~~
> 'edmc%'::text)
>         ->  Subquery Scan "*SELECT* 5"  (cost=7840.14..7840.32 rows=8
> width=32) (actual time=27276.689..27276.734 rows=22 loops=1)
>               ->  HashAggregate  (cost=7840.14..7840.24 rows=8
> width=32) (actual time=27276.685..27276.713 rows=22 loops=1)
>                     ->  Hash Join  (cost=1248.86..7840.04 rows=8
> width=32) (actual time=109.205..27254.938 rows=6134 loops=1)
>                           Hash Cond: ("outer".sentryid_id =
> "inner".sentryid_id)
>                           ->  Nested Loop  (cost=1233.35..7815.29
> rows=1832 width=32) (actual time=109.142..27240.746 rows=9592 loops=1)
>                                 ->  Unique  (cost=1218.94..1219.05
> rows=1 width=8) (actual time=1.784..2.128 rows=29 loops=1)
>                                       ->  Sort
> (cost=1218.94..1219.00 rows=21 width=8) (actual time=1.782..1.922
> rows=307 loops=1)
>                                             Sort Key: ssa_tr_dy.source_ip_num
>                                             ->  Bitmap Heap Scan on
> ssa_tr_dy  (cost=98.52..1218.48 rows=21 width=8) (actual
> time=0.724..1.457 rows=307 loops=1)
>                                                   Recheck Cond:
> (firstoccurrence > (now() - '8 days'::interval))
>                                                   Filter:
> ((sentryid)::text ~~ 'edmc%'::text)
>                                                   ->  Bitmap Index
> Scan on ssa_dn_tr_idx1  (cost=0.00..98.52 rows=4148 width=0) (actual
> time=0.694..0.694 rows=1230 loops=1)
>                                                         Index Cond:
> (firstoccurrence > (now() - '8 days'::interval))
>                                 ->  Bitmap Heap Scan on
> pix_tr_dy_dimension_2009_10_11 a  (cost=14.41..6573.33 rows=1832
> width=32) (actual time=19.473..938.665 rows=331 loops=29)
>                                       Recheck Cond: (a.source_ip_num
> = "outer".source_ip_num)
>                                       ->  Bitmap Index Scan on
> pix_tr_dy_dimension_2009_10_11_source_ip_num  (cost=0.00..14.41
> rows=1832 width=0) (actual time=16.342..16.342 rows=331 loops=29)
>                                             Index Cond:
> (a.source_ip_num = "outer".source_ip_num)
>                           ->  Hash  (cost=15.50..15.50 rows=2
> width=4) (actual time=0.039..0.039 rows=3 loops=1)
>                                 ->  Seq Scan on pix_tr_sentryid sent
> (cost=0.00..15.50 rows=2 width=4) (actual time=0.028..0.034 rows=3
> loops=1)
>                                       Filter: ((sentryid)::text ~~
> 'edmc%'::text)
>         ->  Subquery Scan "*SELECT* 6"  (cost=7850.10..7850.28 rows=8
> width=32) (actual time=62773.885..62773.954 rows=27 loops=1)
>               ->  HashAggregate  (cost=7850.10..7850.20 rows=8
> width=32) (actual time=62773.880..62773.917 rows=27 loops=1)
>                     ->  Hash Join  (cost=1248.80..7850.00 rows=8
> width=32) (actual time=62.896..62719.489 rows=15348 loops=1)
>                           Hash Cond: ("outer".sentryid_id =
> "inner".sentryid_id)
>                           ->  Nested Loop  (cost=1233.30..7825.34
> rows=1815 width=32) (actual time=62.814..62687.076 rows=20370 loops=1)
>                                 ->  Unique  (cost=1218.94..1219.05
> rows=1 width=8) (actual time=1.912..2.330 rows=29 loops=1)
>                                       ->  Sort
> (cost=1218.94..1219.00 rows=21 width=8) (actual time=1.912..2.063
> rows=307 loops=1)
>                                             Sort Key: ssa_tr_dy.source_ip_num
>                                             ->  Bitmap Heap Scan on
> ssa_tr_dy  (cost=98.52..1218.48 rows=21 width=8) (actual
> time=0.765..1.583 rows=307 loops=1)
>                                                   Recheck Cond:
> (firstoccurrence > (now() - '8 days'::interval))
>                                                   Filter:
> ((sentryid)::text ~~ 'edmc%'::text)
>                                                   ->  Bitmap Index
> Scan on ssa_dn_tr_idx1  (cost=0.00..98.52 rows=4148 width=0) (actual
> time=0.733..0.733 rows=1230 loops=1)
>                                                         Index Cond:
> (firstoccurrence > (now() - '8 days'::interval))
>                                 ->  Bitmap Heap Scan on
> pix_tr_dy_dimension_2009_10_12 a  (cost=14.35..6583.60 rows=1815
> width=32) (actual time=39.032..2160.324 rows=702 loops=29)
>                                       Recheck Cond: (a.source_ip_num
> = "outer".source_ip_num)
>                                       ->  Bitmap Index Scan on
> pix_tr_dy_dimension_2009_10_12_source_ip_num  (cost=0.00..14.35
> rows=1815 width=0) (actual time=30.805..30.805 rows=702 loops=29)
>                                             Index Cond:
> (a.source_ip_num = "outer".source_ip_num)
>                           ->  Hash  (cost=15.50..15.50 rows=2
> width=4) (actual time=0.046..0.046 rows=3 loops=1)
>                                 ->  Seq Scan on pix_tr_sentryid sent
> (cost=0.00..15.50 rows=2 width=4) (actual time=0.035..0.041 rows=3
> loops=1)
>                                       Filter: ((sentryid)::text ~~
> 'edmc%'::text)
>         ->  Subquery Scan "*SELECT* 7"  (cost=7376.13..7376.31 rows=8
> width=32) (actual time=44866.931..44866.996 rows=30 loops=1)
>               ->  HashAggregate  (cost=7376.13..7376.23 rows=8
> width=32) (actual time=44866.927..44866.969 rows=30 loops=1)
>                     ->  Hash Join  (cost=1248.32..7376.03 rows=8
> width=32) (actual time=77.172..44826.884 rows=11881 loops=1)
>                           Hash Cond: ("outer".sentryid_id =
> "inner".sentryid_id)
>                           ->  Nested Loop  (cost=1232.81..7352.06
> rows=1677 width=32) (actual time=77.098..44803.266 rows=16665 loops=1)
>                                 ->  Unique  (cost=1218.94..1219.05
> rows=1 width=8) (actual time=1.816..2.149 rows=29 loops=1)
>                                       ->  Sort
> (cost=1218.94..1219.00 rows=21 width=8) (actual time=1.815..1.940
> rows=307 loops=1)
>                                             Sort Key: ssa_tr_dy.source_ip_num
>                                             ->  Bitmap Heap Scan on
> ssa_tr_dy  (cost=98.52..1218.48 rows=21 width=8) (actual
> time=0.718..1.491 rows=307 loops=1)
>                                                   Recheck Cond:
> (firstoccurrence > (now() - '8 days'::interval))
>                                                   Filter:
> ((sentryid)::text ~~ 'edmc%'::text)
>                                                   ->  Bitmap Index
> Scan on ssa_dn_tr_idx1  (cost=0.00..98.52 rows=4148 width=0) (actual
> time=0.687..0.687 rows=1230 loops=1)
>                                                         Index Cond:
> (firstoccurrence > (now() - '8 days'::interval))
>                                 ->  Bitmap Heap Scan on
> pix_tr_dy_dimension_2009_10_13 a  (cost=13.87..6112.04 rows=1677
> width=32) (actual time=27.864..1543.963 rows=575 loops=29)
>                                       Recheck Cond: (a.source_ip_num
> = "outer".source_ip_num)
>                                       ->  Bitmap Index Scan on
> pix_tr_dy_dimension_2009_10_13_source_ip_num  (cost=0.00..13.87
> rows=1677 width=0) (actual time=23.339..23.339 rows=575 loops=29)
>                                             Index Cond:
> (a.source_ip_num = "outer".source_ip_num)
>                           ->  Hash  (cost=15.50..15.50 rows=2
> width=4) (actual time=0.050..0.050 rows=3 loops=1)
>                                 ->  Seq Scan on pix_tr_sentryid sent
> (cost=0.00..15.50 rows=2 width=4) (actual time=0.039..0.044 rows=3
> loops=1)
>                                       Filter: ((sentryid)::text ~~
> 'edmc%'::text)
>  Total runtime: 276143.006 ms
>
>
> On Wed, Oct 14, 2009 at 10:15 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Anj Adu <fotographs(at)gmail(dot)com> writes:
>>> This query is doing a sequential scan on the child partitions even
>>> though indexes on all constrained columns are present
>>
>> It looks to me like it's doing exactly what it is supposed to, ie,
>> indexscan on the partitions where it would help and seqscans on the
>> partitions where it wouldn't.  Indexscan is not better than seqscan
>> for retrieving all or most of a table.
>>
>>                        regards, tom lane
>>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message waldomiro 2009-10-16 03:33:47 There is a statistic table?
Previous Message Heikki Linnakangas 2009-10-15 20:50:21 Re: [PERFORM] Regarding facing lot of time Consumed by Socket.Poll()