Slow Query

From: "Ozer, Pam" <pozer(at)automotive(dot)com>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Slow Query
Date: 2010-08-27 00:03:27
Message-ID: C3CA20D47A2DF94FBF9CF0AE8D964CC412438A@mail-001.corp.automotive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

I am new to Postgres and I am trying to understand the Explain Analyze
so I can tune the following query. I run the same query using mysql and
it takes less than 50ms. I run it on postgres and it takes 10 seconds.
I feel like I am missing something very obvious. (VehicleUsed is a big
table over 750,000records) and datasetgroupyearmakemodel has 150000
records.

It looks like the cost is highest in the Hash Join on Postalcode. Am
I reading this correctly.? I do have indexes on the lower(postalcode)
in both tables. Why wouldn't be using the index? Thanks in advance for
any help.

Here is my query:

select distinct VehicleMake.VehicleMake

from VehicleUsed

inner join PostalCodeRegionCountyCity on ( lower (
VehicleUsed.PostalCode ) = lower ( PostalCodeRegionCountyCity.PostalCode
) )

INNER JOIN DATASETGROUPYEARMAKEMODEL ON ( VEHICLEUSED.VEHICLEYEAR =
DATASETGROUPYEARMAKEMODEL.VEHICLEYEAR )

AND ( VEHICLEUSED.VEHICLEMAKEID =
DATASETGROUPYEARMAKEMODEL.VEHICLEMAKEID )

AND ( VEHICLEUSED.VEHICLEMODELID =
DATASETGROUPYEARMAKEMODEL.VEHICLEMODELID )

inner join VehicleMake on ( VehicleUsed.VehicleMakeId =
VehicleMake.VehicleMakeId )

where

( DatasetGroupYearMakeModel.DatasetGroupId = 3 ) and

( VehicleUsed.DatasetId <> 113 )

and ( VehicleUsed.ProductGroupId <> 13 )

and ( PostalCodeRegionCountyCity.RegionId = 36 )

order by VehicleMake.VehicleMake

limit 500000

Here is the explain analyze

"Limit (cost=38292.53..38293.19 rows=261 width=8) (actual
time=10675.857..10675.892 rows=42 loops=1)"

" -> Sort (cost=38292.53..38293.19 rows=261 width=8) (actual
time=10675.855..10675.868 rows=42 loops=1)"

" Sort Key: vehiclemake.vehiclemake"

" Sort Method: quicksort Memory: 18kB"

" -> HashAggregate (cost=38279.45..38282.06 rows=261 width=8)
(actual time=10675.710..10675.728 rows=42 loops=1)"

" -> Hash Join (cost=436.31..38270.51 rows=3576 width=8)
(actual time=4.471..10658.291 rows=10425 loops=1)"

" Hash Cond: (vehicleused.vehiclemakeid =
vehiclemake.vehiclemakeid)"

" -> Hash Join (cost=428.43..38213.47 rows=3576
width=4) (actual time=4.152..10639.742 rows=10425 loops=1)"

" Hash Cond:
(lower((vehicleused.postalcode)::text) =
lower((postalcoderegioncountycity.postalcode)::text))"

" -> Nested Loop (cost=101.81..37776.78
rows=11887 width=10) (actual time=1.172..9876.586 rows=382528 loops=1)"

" -> Bitmap Heap Scan on
datasetgroupyearmakemodel (cost=101.81..948.81 rows=5360 width=6)
(actual time=0.988..17.800 rows=5377 loops=1)"

" Recheck Cond: (datasetgroupid =
3)"

" -> Bitmap Index Scan on
datasetgroupyearmakemodel_i04 (cost=0.00..100.47 rows=5360 width=0)
(actual time=0.830..0.830 rows=5377 loops=1)"

" Index Cond: (datasetgroupid
= 3)"

" -> Index Scan using vehicleused_i10 on
vehicleused (cost=0.00..6.85 rows=1 width=12) (actual time=0.049..1.775
rows=71 loops=5377)"

" Index Cond:
((vehicleused.vehiclemodelid = datasetgroupyearmakemodel.vehiclemodelid)
AND (vehicleused.vehiclemakeid =
datasetgroupyearmakemodel.vehiclemakeid) AND (vehicleused.vehicleyear =
datasetgroupyearmakemodel.vehicleyear))"

" Filter: ((vehicleused.datasetid
<> 113) AND (vehicleused.productgroupid <> 13))"

" -> Hash (cost=308.93..308.93 rows=1416
width=6) (actual time=2.738..2.738 rows=1435 loops=1)"

" -> Bitmap Heap Scan on
postalcoderegioncountycity (cost=27.23..308.93 rows=1416 width=6)
(actual time=0.222..0.955 rows=1435 loops=1)"

" Recheck Cond: (regionid = 36)"

" -> Bitmap Index Scan on
postalcoderegioncountycity_i05 (cost=0.00..26.87 rows=1416 width=0)
(actual time=0.202..0.202 rows=1435 loops=1)"

" Index Cond: (regionid =
36)"

" -> Hash (cost=4.61..4.61 rows=261 width=10)
(actual time=0.307..0.307 rows=261 loops=1)"

" -> Seq Scan on vehiclemake (cost=0.00..4.61
rows=261 width=10) (actual time=0.033..0.154 rows=261 loops=1)"

"Total runtime: 10676.058 ms"

Pam Ozer

Data Architect

pozer(at)automotive(dot)com <mailto:pozer(at)automotive(dot)com>

tel. 949.705.3468

Source Interlink Media

1733 Alton Pkwy Suite 100, Irvine, CA 92606

www.simautomotive.com <http://www.simautomotive.com>

Confidentiality Notice- This electronic communication, and all
information herein, including files attached hereto, is private, and is
the property of the sender. This communication is intended only for the
use of the individual or entity named above. If you are not the intended
recipient, you are hereby notified that any disclosure of; dissemination
of; distribution of; copying of; or, taking any action in reliance upon
this communication, is strictly prohibited. If you have received this
communication in error, please immediately notify us by telephone,
(949)-705-3000, and destroy all copies of this communication. Thank you.

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Scott Marlowe 2010-08-27 00:18:54 Re: Slow Query
Previous Message Scott Marlowe 2010-08-26 21:13:44 Re: [Fwd: postgres 8.4.1 number of connections]