Re: postgres 8.2 seems to prefer Seq Scan

From: "Alex Deucher" <alexdeucher(at)gmail(dot)com>
To: "Michael Fuhr" <mike(at)fuhr(dot)org>
Cc: "PostgreSQL Performance" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: postgres 8.2 seems to prefer Seq Scan
Date: 2007-04-09 19:14:48
Message-ID: a728f9f90704091214h544c476bq2cdb24b31c5c63d0@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 4/9/07, Alex Deucher <alexdeucher(at)gmail(dot)com> wrote:
> On 4/6/07, Michael Fuhr <mike(at)fuhr(dot)org> wrote:
> > On Fri, Apr 06, 2007 at 04:38:33PM -0400, Alex Deucher wrote:
> > > One more anomaly between 7.4 and 8.2. DB dumped from 7.4 and loaded
> > > onto 8.2, both have locale set to C. 8.2 seems to prefer Seq Scans
> > > for the first query while the ordering in the second query seems to
> > > perform worse on 8.2. I ran analyze. I've tried with the encoding
> > > set to UTF-8 and SQL_ASCII; same numbers and plans. Any ideas how to
> > > improve this?
> >
> > Are you sure the data sets are identical? The 7.4 query returned
> > 0 rows; the 8.2 query returned 1 row. If you're running the same
> > query against the same data in both versions then at least one of
> > them appears to be returning the wrong result. Exactly which
> > versions of 7.4 and 8.2 are you running?
> >
> > Have you analyzed all tables in both versions? The row count
> > estimate in 7.4 is much closer to reality than in 8.2:
> >
> > 7.4
> > > -> Index Scan using pnum_idx on event (cost=0.00..3.37 rows=19
> > > width=172) (actual time=0.063..0.063 rows=0 loops=1)
> > > Index Cond: ((pnum)::text = 'AB5819188'::text)
> >
> > 8.2
> > > -> Index Scan using pnum_idx on event (cost=0.00..3147.63
> > > rows=1779 width=171) (actual time=0.030..0.033 rows=1 loops=1)
> > > Index Cond: ((pnum)::text = 'AB5819188'::text)
> >
> > If analyzing the event table doesn't improve the row count estimate
> > then try increasing the statistics target for event.pnum and analyzing
> > again. Example:
> >
> > ALTER TABLE event ALTER pnum SET STATISTICS 100;
> > ANALYZE event;
> >
> > You can set the statistics target as high as 1000 to get more
> > accurate results at the cost of longer ANALYZE times.
> >
>
> Setting statistics to 400 seems to be the sweet spot. Values above
> that seem to only marginally improve performance. However, I have to
> disable seqscan in order for the query to be fast. Why does the query
> planner insist on doing a seq scan? Is there anyway to make it prefer
> the index scan?
>

FWIW, disabling seqscan also makes the second query much faster:

EXPLAIN ANALYZE select e.pnum, c.code_description, c.code_mcam,
e.event_pid from event e, code c where c.code_name =
e.ref_country_legal_code and c.code_country = e.ref_country and e.pnum
= 'US5819188';
QUERY
PLAN
-------------------------------------------------------------------------------------------------------------------------------------
Merge Join (cost=104.13..257.65 rows=1 width=73) (actual
time=0.038..0.038 rows=0 loops=1)
Merge Cond: ((c.code_country)::text = "inner"."?column5?")
Join Filter: ((c.code_name)::text = (e.ref_country_legal_code)::text)
-> Index Scan using code_country_idx on code c (cost=0.00..134.00
rows=2885 width=69) (actual time=0.012..0.012 rows=1 loops=1)
-> Sort (cost=104.13..104.27 rows=54 width=30) (actual
time=0.019..0.021 rows=1 loops=1)
Sort Key: (e.ref_country)::text
-> Index Scan using pnum_idx on event e (cost=0.00..102.58
rows=54 width=30) (actual time=0.010..0.012 rows=1 loops=1)
Index Cond: ((pnum)::text = 'US5819188'::text)
Total runtime: 0.072 ms
(9 rows)

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Spiegelberg, Greg 2007-04-09 20:01:53 DELETE with filter on ctid
Previous Message Merlin Moncure 2007-04-09 19:05:17 join to view over custom aggregate seems like it should be faster