From: | "Francisco Reyes" <lists(at)stringsutils(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Any way to favor index scans, but not bitmap index scans? |
Date: | 2008-07-23 20:12:10 |
Message-ID: | 0b8afb1dff45c26725b204a62de0a4be@stringsutils.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 3:37 pm 07/23/08 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> "Francisco Reyes" <lists(at)stringsutils(dot)com> writes:
> > SET ENABLE_SEQSCAN TO OFF;
> > SET ENABLE_BITMAPSCAN TO OFF;
> > Aggregate (cost=25665216.10..25665216.11 rows=1 width=12) (actual
> > time=3088.894..3088.896 rows=1 loops=1)
> > -> Nested Loop (cost=0.00..25662307.70 rows=387785 width=12)
> > (actual time=0.264..2624.680 rows=194734 loops=1)
> > -> Index Scan using join_ids_join_id on join_ids
> > (cost=0.00..2867051.21 rows=5020 width=4) (actual
> > time=0.237..1236.019 rows=4437 loops=1)
> > Filter: (customer_id = ANY ('{1014,2485,4636,4635,12
> 55,547,374,580}'::integer[]))
> > -> Index Scan using historical_join_id_date on historical
> > (cost=0.00..4522.43 rows=1477 width=16) (actual
> > time=0.010..0.153 rows=44 loops=4437)
> > Index Cond: ((historical.join_id =
> > join_ids.join_id) AND (historical.date > '2007-04-01'::date)
> > AND (historical.date < '2008-05-01'::date))
> > Filter: (trans.f5 > 0::numeric)
> > Total runtime: 3091.227 ms --> 3 seconds
>
> You might be more likely to get a sane plan if you had an index on
> join_ids.customer_id.
There is an index in join_ids:
joinids_customerids_joinid" btree (customer_id, joinid) WITH (fillfactor=98)
Also, that plan is only 3 seconds. That is as good as that is going to get.
Or where you refering that the other plans would have been better?
> The first indexscan above is really a
> completely silly choice, and would never have been used if you
> weren't holding a gun to the planner's head.
I have much to learn about how to properly read an explain analyze, but as
silly as that plan may look it outperforms the other plans by orders of
magnitude. 3 seconds vs 12 minutes is a very big difference. It was so fast
that I even compared the results (which happens to be a single row) to make
sure I was getting the correct value.
>The index isn't contributing any selectivity at all.
Which index scan? Are analyze read bottom up right?
If it is this one you are refering to:
-> Index Scan using historical_join_id_date on historical
> > (cost=0.00..4522.43 rows=1477 width=16) (actual
> > time=0.010..0.153 rows=44 loops=4437)
> > Index Cond: ((historical.join_id =
> > join_ids.join_id) AND (historical.date > '2007-04-01'::date)
> > AND (historical.date < '2008-05-01'::date))
> > Filter: (trans.f5 > 0::numeric)
I believe that is the reason performance is good with that plan.
The number of rows that need to be returned from historical is less than 1%.
> The other part of the problem is the factor-of-thirty overestimate of
> the number of rows that the inner indexscan will produce (which means
> also a factor-of-thirty overestimate of its cost). Perhaps higher
> statistics targets for these two relations would give you a better
> estimate there.
Is it possible to go over
default_statistics_target = 1000?
> since there's no additional join condition. What PG version are you
> running exactly?
8.3.3
I have only been at this job for 3 months and I can say that neither the
data, nor the previous design I am trying to replace play nice with
postgresql. I can't get into specifics, but I can say that our "historical"
tables have about 60% data that is not used in most queries. I think that
is partly what throws off the planner so much. My first clue was when I saw
the planner trying to do sequential scans to retrieve less than 1% of rows.
It didn't make sense.
I tried several schemes with partitioning and that was even worse.
I am going to convert the tables structure names to the mapping names I
used in these thread. Perhaps that may be informative.
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2008-07-23 20:23:09 | Re: Equality search on timestamp value returns no rows |
Previous Message | dpage | 2008-07-23 19:58:02 | Re: mac install question |