Skip site navigation (1) Skip section navigation (2)

Re: JOIN not being calculated correctly

From: "Scott Pederick" <sql(dot)postgresql(dot)org(at)pederick(dot)com>
To: <pgsql-sql(at)postgresql(dot)org>
Subject: Re: JOIN not being calculated correctly
Date: 2004-11-20 21:37:54
Message-ID: 000501c4cf49$31c346e0$e1f9f4dc@boblaptop (view raw or flat)
Thread:
Lists: pgsql-sql
Hi Andrew,

Thanks for your response - sorry it's taken so long to reply, I've been out
of contact for a while.

Still can't fathom why it's not using an index scan on the jobs table..

: Posting the EXPLAIN is a good, an EXPLAIN ANALYZE would be better
: (assuming your dataset is small enough for it to complete in this
: lifetime). You also need to include the following information:
:
: 1) The schema involved, including information about indexes being used.

Here's the EXPLAIN ANALYZE, schemas and indexes:

Quarry=> EXPLAIN ANALYZE SELECT Customers.CustomerId, Jobs.JobId FROM
Customers INNER JOIN Jobs USING (CustomerId);
                                                      QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------
 Hash Join  (cost=78.54..7924.71 rows=70727 width=8) (actual
time=50.000..21941.000 rows=70727 loops=1)
   Hash Cond: ("outer".customerid = "inner".customerid)
   ->  Seq Scan on jobs  (cost=0.00..6785.27 rows=70727 width=8) (actual
time=40.000..21040.000 rows=70727 loops=1)
   ->  Hash  (cost=76.03..76.03 rows=1003 width=4) (actual
time=10.000..10.000 rows=0 loops=1)
         ->  Seq Scan on customers  (cost=0.00..76.03 rows=1003 width=4)
(actual time=0.000..0.000 rows=1003 loops=1)
 Total runtime: 22292.000 ms
(6 rows)


Customers
---
CREATE TABLE customers (
    customerid serial NOT NULL,
    account character varying(6) DEFAULT ''::character varying NOT NULL,
    businessname character varying(40) DEFAULT ''::character varying NOT
NULL,
    address character varying(30) DEFAULT ''::character varying NOT NULL,
    suburb character varying(30) DEFAULT ''::character varying NOT NULL,
    postcode character varying(4) DEFAULT ''::character varying NOT NULL,
    state character(3) DEFAULT ''::bpchar NOT NULL,
    mobile character varying(14) DEFAULT ''::character varying NOT NULL,
    fax character varying(14) DEFAULT ''::character varying NOT NULL,
    acctitle character varying(4) DEFAULT ''::character varying NOT NULL,
    accfirst character varying(14) DEFAULT ''::character varying NOT NULL,
    acclast character varying(14) DEFAULT ''::character varying NOT NULL,
    accphone character varying(14) DEFAULT ''::character varying NOT NULL,
    accfax character(1) DEFAULT ''::bpchar NOT NULL,
    accmobile character varying(14) DEFAULT ''::character varying NOT NULL,
    alttitle character varying(4) DEFAULT ''::character varying NOT NULL,
    altfirst character varying(14) DEFAULT ''::character varying NOT NULL,
    altlast character varying(14) DEFAULT ''::character varying NOT NULL,
    current numeric(10,2) DEFAULT 0 NOT NULL,
    aged30 numeric(10,2) DEFAULT 0 NOT NULL,
    aged60 numeric(10,2) DEFAULT 0 NOT NULL,
    aged90 numeric(10,2) DEFAULT 0 NOT NULL,
    ytd numeric(10,2) DEFAULT 0 NOT NULL,
    pg character(1) DEFAULT ''::bpchar NOT NULL,
    acclimit character varying(14) DEFAULT ''::character varying NOT NULL,
    accauto character(1) DEFAULT ''::bpchar NOT NULL,
    text text DEFAULT ''::text NOT NULL,
    fee character(1) DEFAULT ''::bpchar NOT NULL,
    gst character(1) DEFAULT ''::bpchar NOT NULL,
    stop character(1) DEFAULT ''::bpchar NOT NULL
);

Customers - Indexes
---
customers_pkey PRIMARY KEY (customerid);
UNIQUE INDEX customers_account ON customers USING btree (account);



Jobs
---
CREATE TABLE jobs (
    jobid serial NOT NULL,
    customerid integer,
    quarryid integer,
    producttypeid integer,
    invoiceid integer,
    salesid integer,
    orderid character varying(15) DEFAULT ''::character varying NOT NULL,
    jobdate date,
    contactname character varying(40) DEFAULT ''::character varying NOT
NULL,
    businessname character varying(40) DEFAULT ''::character varying NOT
NULL,
    address character varying(30) DEFAULT ''::character varying NOT NULL,
    suburb character varying(30) DEFAULT ''::character varying NOT NULL,
    phone character varying(14) DEFAULT ''::character varying NOT NULL,
    melway character varying(4) DEFAULT ''::character varying NOT NULL,
    melxy character(3) DEFAULT ''::bpchar NOT NULL,
    quarryorder numeric(10,2) DEFAULT 0 NOT NULL,
    quarrypickup numeric(10,2) DEFAULT 0 NOT NULL,
    quarryprice numeric(10,2) DEFAULT 0 NOT NULL,
    transport numeric(10,2) DEFAULT 0 NOT NULL,
    margin numeric(10,2) DEFAULT 0 NOT NULL,
    unit character(1) DEFAULT ''::bpchar NOT NULL,
    done character(1) DEFAULT ''::bpchar NOT NULL,
    xno character varying(10) DEFAULT ''::character varying NOT NULL,
    mzone character(1) DEFAULT ''::bpchar NOT NULL,
    invoice numeric(10,2) DEFAULT 0 NOT NULL,
    drivercomment character varying(12) DEFAULT ''::character varying NOT
NULL
    text text DEFAULT ''::text NOT NULL,
    gst numeric(10,2) DEFAULT 0 NOT NULL,
    distance numeric(7,2) DEFAULT 0 NOT NULL,
    productdefinitionid integer,
    customerprice numeric(10,2) DEFAULT 0 NOT NULL
);

Jobs Keys/Indexes
---
jobs_pkey PRIMARY KEY (jobid);
INDEX jobs_customerid ON jobs USING btree (customerid);
INDEX jobs_jobdate ON jobs USING btree (jobdate);
INDEX jobs_quarryid ON jobs USING btree (quarryid);

CONSTRAINT jobs_customerid_fkey FOREIGN KEY (customerid) REFERENCES
customers(customerid) ON UPDATE RESTRICT ON DELETE RESTRICT;
CONSTRAINT jobs_invoiceid_fkey FOREIGN KEY (invoiceid) REFERENCES
invoices(invoiceid) ON UPDATE RESTRICT ON DELETE RESTRICT;
ADD CONSTRAINT jobs_productdefinitionid_fkey FOREIGN KEY
(productdefinitionid) REFERENCES producttypes(producttypeid) ON UPDATE
RESTRICT;
CONSTRAINT jobs_producttypeid_fkey FOREIGN KEY (producttypeid) REFERENCES
producttypes(producttypeid) ON UPDATE RESTRICT ON DELETE RESTRICT;
CONSTRAINT jobs_quarryid_fkey FOREIGN KEY (quarryid) REFERENCES
quarries(quarryid) ON UPDATE RESTRICT ON DELETE RESTRICT;
CONSTRAINT jobs_salesid_fkey FOREIGN KEY (salesid) REFERENCES sales(salesid)
ON UPDATE RESTRICT ON DELETE RESTRICT;



: 2) Have you vacuumed / analyzed the tables involved recently?

Yes, I ran "VACUUM ANALYZE" on both table before I started.


: 3) Have you modified the stats on any of the tables / columns involve or
: are you using defaults?

I'm using the defaults...

Thanks in advance to anyone who has taken the time to wade through this
one... :)

Scott

:
: Drew
:
:
: Scott Pederick wrote:
: | Hi all!
: |
: | I'm having a problem with Postgresql 8.0.0-beta4 on a Win2K machine and
a
: | particular join.
: |
: | I've got two tables - a list of customers and jobs they've had. A
customer
: | can have multiple jobs.
: |
: | The query always scans the entire jobs table for each customer - I need
it
: | the other way around so I can get a list of the customers who have at
: least
: | one job.
: |
: | The EXPLAIN shows the jobs table is being scanned for some reason:
: |
: | Quarry=# EXPLAIN SELECT Customers.CustomerId, Jobs.JobId FROM Customers
: | INNER JOIN Jobs USING (CustomerId);
: |                                QUERY PLAN
:
| -------------------------------------------------------------------------
: |  Hash Join  (cost=78.54..4908.71 rows=70727 width=8)
: |    Hash Cond: ("outer".customerid = "inner".customerid)
: |    ->  Seq Scan on jobs  (cost=0.00..3769.27 rows=70727 width=8)
: |    ->  Hash  (cost=76.03..76.03 rows=1003 width=4)
: |          ->  Seq Scan on customers  (cost=0.00..76.03 rows=1003 width=4)
: | (5 rows)
: |
: |
: |
: | Even if I reverse the JOIN I get the exact same result:
: |
: | Quarry=# EXPLAIN SELECT Customers.CustomerId, Jobs.JobId FROM Jobs INNER
: | JOIN Customers USING (CustomerId);
: |                                QUERY PLAN
:
| -------------------------------------------------------------------------
: |  Hash Join  (cost=78.54..4908.71 rows=70727 width=8)
: |    Hash Cond: ("outer".customerid = "inner".customerid)
: |    ->  Seq Scan on jobs  (cost=0.00..3769.27 rows=70727 width=8)
: |    ->  Hash  (cost=76.03..76.03 rows=1003 width=4)
: |          ->  Seq Scan on customers  (cost=0.00..76.03 rows=1003 width=4)
: | (5 rows)
: |
: |
: | How can I force it to operate as I need it to? It seems the query
: engine is
: | a little smarter than it needs to be.
: |
: | If anyone can shed some light on this problem, it would be greatly
: | appreciated. I've taken it as far as I can and don't really know where
to
: | move from here.
:
:
: - --
: Andrew Hammond    416-673-4138    ahammond(at)ca(dot)afilias(dot)info
: Database Administrator, Afilias Canada Corp.
: CB83 2838 4B67 D40F D086 3568 81FC E7E5 27AF 4A9A
: -----BEGIN PGP SIGNATURE-----
: Version: GnuPG v1.2.5 (GNU/Linux)
:
: iD8DBQFBh9YQgfzn5SevSpoRAg0LAKCg5K7IccFIOvdTc8DEl2YaUMcUCgCfRt2Q
: CI1Vo6yxHkrWcoTQMQ/EvOw=
: =m15B
: -----END PGP SIGNATURE-----
:
: ---------------------------(end of broadcast)---------------------------
: TIP 3: if posting/reading through Usenet, please send an appropriate
:       subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
:       message can get through to the mailing list cleanly
:


In response to

Responses

pgsql-sql by date

Next:From: Tom LaneDate: 2004-11-20 21:51:34
Subject: Re: JOIN not being calculated correctly
Previous:From: Jeff EckermannDate: 2004-11-20 17:13:39
Subject: Re: How to get the max on a char column?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group