Outer joins and Seq scans

From: Sami Dalouche <skoobi(at)free(dot)fr>
To: pgsql-performance(at)postgresql(dot)org
Subject: Outer joins and Seq scans
Date: 2007-10-28 21:25:10
Message-ID: 1193606710.9008.12.camel@samlaptop
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

I have a query that uses left outer join, and this seems to prevent the
index on the right column to be used.

I couldn't really trim down the query without having the index used
normally..

So, I have the following tables that join :

Offer -> AdCreatedEvent -> Account -> ContactInformation -> City ->
GisFeature
and
Offer -> ResidenceDescription -> City -> GisFeature

The query is at the end of the email.

What happens is that the "ContactInformation -> City " outer join link
prevents postgres from using the index on City.

If I only join offer -> ResidenceDescription -> City -> GisFeature, the
index is used;
If I join everything in the query without the GIS condition, the index
is used :
gf2_.location && setSRID(cast ('BOX3D(1.5450494105576016
48.73176862850233,3.1216171894423983 49.00156477149768)'as box3d), 4326)
AND distance_sphere(gf2_.location, GeomFromText('POINT(2.3333333
48.8666667)',4326)) <= 15000 limit 10 offset 10

Also, if I replace the ContactInformation -> City link by an inner
join, the index is used.

With the outer join,
explain analyze tells me :
Hash (cost=37037.86..37037.86 rows=2331986 width=16)
-> Seq Scan on city city1_
(cost=0.00..37037.86 rows=2331986 width=16)

Whereas the inner join tells me :
-> Index Scan using cityid on city city8_ (cost=0.00..8.52 rows=1
width=16)

So, what could prevent postgrs from using the index ? I ran all the
vacuum analyze stuff, and the stats cannot possibly tell postgres that
it's not worth using the index (2 million entries in the city table, way
more than in any other table).

Which options do I have to force postgres to use an index here ?

Thanks for your help,
Sami Dalouche

---------------
select * from Offer this_ inner join AdCreatedEvent ace3_ on
this_.adCreatedEvent_id=ace3_.id left outer join FunalaEvent ace3_1_ on
ace3_.id=ace3_1_.id left outer join Account account6_ on
ace3_.eventInitiator_id=account6_.id left outer join ContactInformation
contactinf7_ on account6_.contactInformation_id=contactinf7_.id left
outer join City city8_ on contactinf7_.city_id=city8_.id inner join
ResidenceDescription residenced19_ on
this_.residenceDescription_id=residenced19_.id inner join City city1_ on
residenced19_.city_id=city1_.id inner join GisFeature gf2_ on
city1_.associatedGisFeature_id=gf2_.id left outer join ResidenceType
residencet22_ on residenced19_.residenceType_id=residencet22_.id where
gf2_.location && setSRID(cast ('BOX3D(1.5450494105576016
48.73176862850233,3.1216171894423983 49.00156477149768)'as box3d), 4326)
AND distance_sphere(gf2_.location, GeomFromText('POINT(2.3333333
48.8666667)',4326)) <= 15000 limit 10 offset 10

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2007-10-28 22:08:34 Re: Outer joins and Seq scans
Previous Message Heikki Linnakangas 2007-10-28 18:22:04 Re: Append Cost in query planners