Re: JOIN not being calculated correctly

From: Andrew Hammond <ahammond(at)ca(dot)afilias(dot)info>
To: Scott Pederick <scott(at)pederick(dot)com>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: JOIN not being calculated correctly
Date: 2004-11-02 18:46:41
Message-ID: 4187D611.2000109@ca.afilias.info
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Posting the EXPLAIN is a good, an EXPLAIN ANALYZE would be better
(assuming your dataset is small enough for it to complete in this
lifetime). You also need to include the following information:

1) The schema involved, including information about indexes being used.

2) Have you vacuumed / analyzed the tables involved recently?

3) Have you modified the stats on any of the tables / columns involve or
are you using defaults?

Drew

Scott Pederick wrote:
| Hi all!
|
| I'm having a problem with Postgresql 8.0.0-beta4 on a Win2K machine and a
| particular join.
|
| I've got two tables - a list of customers and jobs they've had. A customer
| can have multiple jobs.
|
| The query always scans the entire jobs table for each customer - I need it
| the other way around so I can get a list of the customers who have at
least
| one job.
|
| The EXPLAIN shows the jobs table is being scanned for some reason:
|
| Quarry=# EXPLAIN SELECT Customers.CustomerId, Jobs.JobId FROM Customers
| INNER JOIN Jobs USING (CustomerId);
| QUERY PLAN
| -------------------------------------------------------------------------
| Hash Join (cost=78.54..4908.71 rows=70727 width=8)
| Hash Cond: ("outer".customerid = "inner".customerid)
| -> Seq Scan on jobs (cost=0.00..3769.27 rows=70727 width=8)
| -> Hash (cost=76.03..76.03 rows=1003 width=4)
| -> Seq Scan on customers (cost=0.00..76.03 rows=1003 width=4)
| (5 rows)
|
|
|
| Even if I reverse the JOIN I get the exact same result:
|
| Quarry=# EXPLAIN SELECT Customers.CustomerId, Jobs.JobId FROM Jobs INNER
| JOIN Customers USING (CustomerId);
| QUERY PLAN
| -------------------------------------------------------------------------
| Hash Join (cost=78.54..4908.71 rows=70727 width=8)
| Hash Cond: ("outer".customerid = "inner".customerid)
| -> Seq Scan on jobs (cost=0.00..3769.27 rows=70727 width=8)
| -> Hash (cost=76.03..76.03 rows=1003 width=4)
| -> Seq Scan on customers (cost=0.00..76.03 rows=1003 width=4)
| (5 rows)
|
|
| How can I force it to operate as I need it to? It seems the query
engine is
| a little smarter than it needs to be.
|
| If anyone can shed some light on this problem, it would be greatly
| appreciated. I've taken it as far as I can and don't really know where to
| move from here.

- --
Andrew Hammond 416-673-4138 ahammond(at)ca(dot)afilias(dot)info
Database Administrator, Afilias Canada Corp.
CB83 2838 4B67 D40F D086 3568 81FC E7E5 27AF 4A9A
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)

iD8DBQFBh9YQgfzn5SevSpoRAg0LAKCg5K7IccFIOvdTc8DEl2YaUMcUCgCfRt2Q
CI1Vo6yxHkrWcoTQMQ/EvOw=
=m15B
-----END PGP SIGNATURE-----

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message Andrew Hammond 2004-11-02 18:56:12 Re: [SQL] Log
Previous Message John B. Scalia 2004-11-02 15:44:41 query using a date field that isn't set