Re: Query using SeqScan instead of IndexScan

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Brendan Duddridge <brendan(at)clickspace(dot)com>
Cc: PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Query using SeqScan instead of IndexScan
Date: 2006-03-31 15:59:12
Message-ID: 20060331155911.GP49405@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

What's the correlation of category_id? The current index scan cost
estimator places a heavy penalty on anything with a correlation much
below about 90%.

On Wed, Mar 29, 2006 at 08:12:28PM -0700, Brendan Duddridge wrote:
> Hi,
>
> I have a query that is using a sequential scan instead of an index
> scan. I've turned off sequential scans and it is in fact faster with
> the index scan.
>
> Here's my before and after.
>
> Before:
>
> ssdev=# SET enable_seqscan TO DEFAULT;
> ssdev=# explain analyze select cp.product_id
> from category_product cp, product_attribute_value pav
> where cp.category_id = 1001082 and cp.product_id =
> pav.product_id;
>
>
> QUERY PLAN
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> ------------------------------
> Hash Join (cost=25.52..52140.59 rows=5139 width=4) (actual
> time=4.521..2580.520 rows=19695 loops=1)
> Hash Cond: ("outer".product_id = "inner".product_id)
> -> Seq Scan on product_attribute_value pav (cost=0.00..40127.12
> rows=2387312 width=4) (actual time=0.039..1469.295 rows=2385846 loops=1)
> -> Hash (cost=23.10..23.10 rows=970 width=4) (actual
> time=2.267..2.267 rows=1140 loops=1)
> -> Index Scan using x_category_product__category_id_fk_idx
> on category_product cp (cost=0.00..23.10 rows=970 width=4) (actual
> time=0.122..1.395 rows=1140 loops=1)
> Index Cond: (category_id = 1001082)
> Total runtime: 2584.221 ms
> (7 rows)
>
>
> After:
>
> ssdev=# SET enable_seqscan TO false;
> ssdev=# explain analyze select cp.product_id
> from category_product cp, product_attribute_value pav
> where cp.category_id = 1001082 and cp.product_id =
> pav.product_id;
>
>
> QUERY PLAN
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> -------------------------------------
> Nested Loop (cost=0.00..157425.22 rows=5139 width=4) (actual
> time=0.373..71.177 rows=19695 loops=1)
> -> Index Scan using x_category_product__category_id_fk_idx on
> category_product cp (cost=0.00..23.10 rows=970 width=4) (actual
> time=0.129..1.438 rows=1140 loops=1)
> Index Cond: (category_id = 1001082)
> -> Index Scan using product_attribute_value__product_id_fk_idx
> on product_attribute_value pav (cost=0.00..161.51 rows=61 width=4)
> (actual time=0.016..0.053 rows=17 loops=1140)
> Index Cond: ("outer".product_id = pav.product_id)
> Total runtime: 74.747 ms
> (6 rows)
>
> There's quite a big difference in speed there. 2584.221 ms vs. 74.747
> ms.
>
> Any ideas what I can do to improve this without turning sequential
> scanning off?
>
> Thanks,
>
> ____________________________________________________________________
> Brendan Duddridge | CTO | 403-277-5591 x24 | brendan(at)clickspace(dot)com
>
> ClickSpace Interactive Inc.
> Suite L100, 239 - 10th Ave. SE
> Calgary, AB T2G 0V9
>
> http://www.clickspace.com
>

--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Frederic Back 2006-03-31 16:02:16 un-'vacuum analyse'
Previous Message Jim C. Nasby 2006-03-31 15:55:54 Re: Indexes with descending date columns