Re: query optimization scenarios 17,701 times faster!!!

From: "Robert Dyas" <rdyas(at)adelphia(dot)net>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: query optimization scenarios 17,701 times faster!!!
Date: 2003-04-24 14:12:31
Message-ID: MGEFJOBFIEAIADIKAMEKOEJECIAA.rdyas@adelphia.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


OK, so you lack mind-reading skills ;-) I wasn't sure if anybody would be
interested in exploring this, but here's the full monty. (full EXPLAIN
ANALYZE output + entire schema) I can't provide the actual data, sorry. But
each table has at most a couple of hundred records. The numbers are actually
a little worse this time (33,623 times faster!!!) because there is slightly
more data in the tables. Performance degrades very rapidly as table size
increases.

EXPLAIN ANALYZE SELECT DISTINCT organizations.city, organizations.inactive,
organizations.java_developers, organizations.name, organizations.org_id,
organizations.overview, organizations.phone, organizations.salesperson,
organizations.state, organizations.time_zone FROM organizations LEFT OUTER
JOIN org_milestones ON (org_milestones.org_id = organizations.org_id) LEFT
OUTER JOIN contacts ON (contacts.org_id = organizations.org_id) LEFT OUTER
JOIN tasks ON (tasks.org_id = organizations.org_id) WHERE
(organizations.org_id = 71) ORDER BY organizations.name
covont_production-# ;

QUERY PLAN
----------------------------------------------------------------------------
----------------------------------------------------------------------------
----------------------------------------------------------------------------
-------------------------
Unique (cost=53.32..54.08 rows=3 width=884) (actual
time=19200.24..24870.87 rows=1 loops=1)
-> Sort (cost=53.32..53.39 rows=27 width=884) (actual
time=19200.19..19315.98 rows=840 loops=1)
Sort Key: organizations.name, organizations.city,
organizations.inactive, organizations.java_developers, organizations.org_id,
organizations.overview, organizations.phone, organizations.salesperson,
organizations.state, organizations.time_zone
-> Hash Join (cost=43.68..52.67 rows=27 width=884) (actual
time=18.42..170.69 rows=840 loops=1)
Hash Cond: ("outer".org_id = "inner".org_id)
-> Hash Join (cost=12.38..20.75 rows=5 width=880) (actual
time=7.18..19.42 rows=42 loops=1)
Hash Cond: ("outer".org_id = "inner".org_id)
-> Nested Loop (cost=0.00..8.24 rows=2 width=876)
(actual time=1.03..6.16 rows=7 loops=1)
Join Filter: ("inner".org_id = "outer".org_id)
-> Index Scan using organizations_pkey on
organizations (cost=0.00..4.66 rows=1 width=872) (actual time=0.32..0.34
rows=1 loops=1)
Index Cond: (org_id = 71)
-> Seq Scan on org_milestones (cost=0.00..2.15
rows=115 width=4) (actual time=0.04..2.97 rows=116 loops=1)
-> Hash (cost=8.36..8.36 rows=136 width=4) (actual
time=5.44..5.44 rows=0 loops=1)
-> Seq Scan on contacts (cost=0.00..8.36
rows=136 width=4) (actual time=0.05..3.21 rows=136 loops=1)
-> Hash (cost=19.92..19.92 rows=292 width=4) (actual
time=10.50..10.50 rows=0 loops=1)
-> Seq Scan on tasks (cost=0.00..19.92 rows=292
width=4) (actual time=0.06..6.38 rows=294 loops=1)
Total runtime: 24881.28 msec
(17 rows)

covont_production=# EXPLAIN ANALYZE SELECT organizations.city,
organizations.inactive, organizations.java_developers, organizations.name,
organizations.org_id, organizations.overview, organizations.phone,
organizations.salesperson, organizations.state, organizations.time_zone FROM
organizations WHERE (organizations.org_id = 71)
covont_production-# ;
QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------------------
Index Scan using organizations_pkey on organizations (cost=0.00..4.66
rows=1 width=872) (actual time=0.28..0.29 rows=1 loops=1)
Index Cond: (org_id = 71)
Total runtime: 0.74 msec
(3 rows)

covont_production=#

****************************************************************************
**
Here is the complete schema:

CREATE TABLE organizations (
org_id SERIAL PRIMARY KEY,
salesperson INTEGER REFERENCES users(user_id),
name VARCHAR(30) NOT NULL,
phone VARCHAR(30),
city VARCHAR(30),
state VARCHAR(2),
time_zone VARCHAR(3),
overview TEXT,
java_developers INTEGER,
inactive BOOLEAN NOT NULL
);

CREATE INDEX organizations_salesperson ON organizations(salesperson);

CREATE TABLE milestones (
milestone_id SERIAL PRIMARY KEY,
name VARCHAR(60) NOT NULL,
red_flag_days INTEGER NOT NULL
);

CREATE TABLE org_milestones (
id SERIAL PRIMARY KEY,
milestone_id INTEGER NOT NULL REFERENCES milestones(milestone_id),
org_id INTEGER NOT NULL REFERENCES organizations(org_id),
completed_on DATE NOT NULL,
notes VARCHAR(250)
);

CREATE TABLE contacts (
contact_id SERIAL PRIMARY KEY,
org_id INTEGER NOT NULL REFERENCES organizations(org_id),
first_name VARCHAR(30) NOT NULL,
last_name VARCHAR(30),
title VARCHAR(60),
phone VARCHAR(30),
email_address VARCHAR(120),
assistant_name VARCHAR(30),
assistant_phone VARCHAR(30),
assistant_email VARCHAR(120),
functional_role VARCHAR(30),
reports_to INTEGER REFERENCES contacts(contact_id),
notes TEXT,
pain VARCHAR(120),
reasons VARCHAR(250),
needs VARCHAR(250),
address_1 VARCHAR(60),
address_2 VARCHAR(60),
city VARCHAR(30),
state VARCHAR(2),
zip_code VARCHAR(10),
time_zone VARCHAR(3)
);

CREATE INDEX contacts_org_id ON contacts(org_id);

CREATE TABLE tasks (
task_id SERIAL PRIMARY KEY,
org_id INTEGER NOT NULL REFERENCES organizations(org_id),
contact_id INTEGER NOT NULL REFERENCES contacts(contact_id),
created_by INTEGER NOT NULL REFERENCES users(user_id),
start_date DATE NOT NULL,
start_time TIME,
completed BOOLEAN NOT NULL,
task_type VARCHAR(30) NOT NULL,
objective VARCHAR(120) NOT NULL,
outcome VARCHAR(120),
notes TEXT,
priority INTEGER NOT NULL
);

CREATE INDEX tasks_org_id ON tasks(org_id);
CREATE INDEX tasks_contact_id ON tasks(contact_id);
CREATE INDEX tasks_start_date ON tasks(start_date);

CREATE TABLE attachments (
attach_id SERIAL PRIMARY KEY,
task_id INTEGER NOT NULL REFERENCES tasks(task_id) ON DELETE
CASCADE,
attachment BYTEA
);

CREATE INDEX attachments_task_id ON attachments(task_id);

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Wednesday, April 23, 2003 11:00 PM
To: Robert Dyas
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] query optimization scenarios 17,701 times
faster!!!

"Robert Dyas" <rdyas(at)adelphia(dot)net> writes:
> The following is a list of query pairs (one fast, one slow) that must
> produce identical results by definition (and do), but have very different
> execution times.

With no details about the table schemas, nor the EXPLAIN ANALYZE output
data, I really wonder how you expect any intelligent comments. We are
programmers, not mind-readers.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Page 2003-04-24 14:22:51 Re: putting new pdf docs on ftp site
Previous Message Hannu Krosing 2003-04-24 13:40:35 Re: Unrelated question