Re: complex query performance assistance request

From: John Mendenhall <john(at)surfutopia(dot)net>
To: pgsql-performance list <pgsql-performance(at)postgresql(dot)org>
Subject: Re: complex query performance assistance request
Date: 2005-08-22 18:21:38
Message-ID: 20050822182138.GA30818@calvin.surfutopia.net
Lists: pgsql-performance

On Sat, 20 Aug 2005, John Mendenhall wrote:

> I need to improve the performance for the following
> query.

I have run the same query in the same database under
different schemas. Each schema has essentially the same
tables and indices; one has an extra backup table and
an extra index, neither of which is used in either of the
explain analyze plans.

The first schema is a development schema, which I used
to performance-tune the server, so everything there performs well.

Here are the current results of the SQL run in the development
environment:

-----
LOG: duration: 852.275 ms statement: explain analyze
SELECT
    c.id AS contact_id,
    sr.id AS sales_rep_id,
    p.id AS partner_id,
    coalesce(LTRIM(RTRIM(c.company)), LTRIM(RTRIM(c.firstname || ' ' || c.lastname))) AS contact_company,
    co.name AS contact_country,
    c.master_key_token
FROM
    sales_reps sr
    JOIN partners p ON (sr.id = p.sales_rep_id)
    JOIN contacts c ON (p.id = c.partner_id)
    JOIN countries co ON (LOWER(c.country) = LOWER(co.code))
    JOIN partner_classification pc ON (p.classification_id = pc.id AND pc.classification != 'Sales Rep')
WHERE
    c.lead_deleted IS NULL
    AND EXISTS
    (
        SELECT
            lr.id
        FROM
            lead_requests lr,
            lead_request_status lrs
        WHERE
            c.id = lr.contact_id AND
            lr.status_id = lrs.id AND
            lrs.is_closed = 0
    )
ORDER BY
    contact_company, contact_id
                                                             QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Sort (cost=18238.25..18238.27 rows=11 width=102) (actual time=823.721..823.915 rows=247 loops=1)
   Sort Key: COALESCE(ltrim(rtrim((c.company)::text)), ltrim(rtrim((((c.firstname)::text || ' '::text) || (c.lastname)::text)))), c.id
   ->  Hash Join (cost=18230.34..18238.06 rows=11 width=102) (actual time=808.042..818.427 rows=247 loops=1)
         Hash Cond: (lower(("outer".code)::text) = lower(("inner".country)::text))
         ->  Seq Scan on countries co (cost=0.00..4.42 rows=242 width=19) (actual time=0.032..1.208 rows=242 loops=1)
         ->  Hash (cost=18230.31..18230.31 rows=9 width=95) (actual time=807.554..807.554 rows=0 loops=1)
               ->  Merge Join (cost=18229.98..18230.31 rows=9 width=95) (actual time=794.413..804.855 rows=247 loops=1)
                     Merge Cond: ("outer".sales_rep_id = "inner".id)
                     ->  Sort (cost=18227.56..18227.59 rows=9 width=95) (actual time=793.132..793.502 rows=250 loops=1)
                           Sort Key: p.sales_rep_id
                           ->  Merge Join (cost=18227.26..18227.42 rows=9 width=95) (actual time=782.832..789.205 rows=250 loops=1)
                                 Merge Cond: ("outer".id = "inner".classification_id)
                                 ->  Sort (cost=1.05..1.05 rows=2 width=10) (actual time=0.189..0.194 rows=2 loops=1)
                                       Sort Key: pc.id
                                       ->  Seq Scan on partner_classification pc (cost=0.00..1.04 rows=2 width=10) (actual time=0.089..0.127 rows=2 loops=1)
                                             Filter: ((classification)::text <> 'Sales Rep'::text)
                                 ->  Sort (cost=18226.21..18226.24 rows=13 width=105) (actual time=782.525..782.818 rows=251 loops=1)
                                       Sort Key: p.classification_id
                                       ->  Merge Join (cost=0.00..18225.97 rows=13 width=105) (actual time=54.135..776.299 rows=449 loops=1)
                                             Merge Cond: ("outer".id = "inner".partner_id)
                                             ->  Index Scan using partners_pkey on partners p (cost=0.00..30.80 rows=395 width=30) (actual time=0.073..6.873 rows=395 loops=1)
                                             ->  Index Scan using contacts_partner_id_idx on contacts c (cost=0.00..130157.20 rows=93 width=85) (actual time=0.366..739.783 rows=453 loops=1)
                                                   Filter: ((lead_deleted IS NULL) AND (subplan))
                                                   SubPlan
                                                     ->  Nested Loop (cost=0.00..6.75 rows=2 width=10) (actual time=0.103..0.103 rows=0 loops=5576)
                                                           Join Filter: ("outer".status_id = "inner".id)
                                                           ->  Index Scan using lead_requests_contact_id_idx on lead_requests lr (cost=0.00..4.23 rows=2 width=20) (actual time=0.075..0.075 rows=0 loops=5576)
                                                                 Index Cond: ($0 = contact_id)
                                                           ->  Seq Scan on lead_request_status lrs (cost=0.00..1.16 rows=8 width=10) (actual time=0.028..0.098 rows=4 loops=522)
                                                                 Filter: (is_closed = 0::numeric)
                     ->  Sort (cost=2.42..2.52 rows=39 width=10) (actual time=1.183..1.569 rows=268 loops=1)
                           Sort Key: sr.id
                           ->  Seq Scan on sales_reps sr (cost=0.00..1.39 rows=39 width=10) (actual time=0.056..0.353 rows=39 loops=1)
 Total runtime: 826.425 ms
(34 rows)
-----

Here is the current run in the production environment,
which I need to bring up to the performance level of the
development environment:

-----
LOG: duration: 6447.934 ms statement: explain analyze
SELECT
    c.id AS contact_id,
    sr.id AS sales_rep_id,
    p.id AS partner_id,
    coalesce(LTRIM(RTRIM(c.company)), LTRIM(RTRIM(c.firstname || ' ' || c.lastname))) AS contact_company,
    co.name AS contact_country,
    c.master_key_token
FROM
    sales_reps sr
    JOIN partners p ON (sr.id = p.sales_rep_id)
    JOIN contacts c ON (p.id = c.partner_id)
    JOIN countries co ON (LOWER(c.country) = LOWER(co.code))
    JOIN partner_classification pc ON (p.classification_id = pc.id AND pc.classification != 'Sales Rep')
WHERE
    c.lead_deleted IS NULL
    AND EXISTS
    (
        SELECT
            lr.id
        FROM
            lead_requests lr,
            lead_request_status lrs
        WHERE
            c.id = lr.contact_id AND
            lr.status_id = lrs.id AND
            lrs.is_closed = 0
    )
ORDER BY
    contact_company, contact_id
                                                             QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Sort (cost=40838.98..40849.08 rows=4042 width=102) (actual time=6418.732..6419.536 rows=1071 loops=1)
   Sort Key: COALESCE(ltrim(rtrim((c.company)::text)), ltrim(rtrim((((c.firstname)::text || ' '::text) || (c.lastname)::text)))), c.id
   ->  Merge Join (cost=40442.25..40596.85 rows=4042 width=102) (actual time=6357.161..6389.616 rows=1071 loops=1)
         Merge Cond: ("outer"."?column3?" = "inner"."?column9?")
         ->  Sort (cost=14.00..14.61 rows=242 width=19) (actual time=9.753..10.018 rows=240 loops=1)
               Sort Key: lower((co.code)::text)
               ->  Seq Scan on countries co (cost=0.00..4.42 rows=242 width=19) (actual time=0.126..3.950 rows=242 loops=1)
         ->  Sort (cost=40428.24..40436.59 rows=3340 width=95) (actual time=6347.154..6348.429 rows=1071 loops=1)
               Sort Key: lower((c.country)::text)
               ->  Merge Join (cost=75.65..40232.76 rows=3340 width=95) (actual time=60.308..6331.266 rows=1071 loops=1)
                     Merge Cond: ("outer".partner_id = "inner".id)
                     ->  Index Scan using contacts_partner_id_idx on contacts c (cost=0.00..161018.18 rows=20120 width=85) (actual time=2.769..6188.886 rows=1548 loops=1)
                           Filter: ((lead_deleted IS NULL) AND (subplan))
                           SubPlan
                             ->  Nested Loop (cost=1.16..6.57 rows=2 width=10) (actual time=0.129..0.129 rows=0 loops=40262)
                                   Join Filter: ("outer".status_id = "inner".id)
                                   ->  Index Scan using lead_requests_contact_id_idx on lead_requests lr (cost=0.00..4.86 rows=3 width=20) (actual time=0.086..0.092 rows=0 loops=40262)
                                         Index Cond: ($0 = contact_id)
                                   ->  Materialize (cost=1.16..1.24 rows=8 width=10) (actual time=0.002..0.013 rows=6 loops=12593)
                                         ->  Seq Scan on lead_request_status lrs (cost=0.00..1.16 rows=8 width=10) (actual time=0.078..0.243 rows=7 loops=1)
                                               Filter: (is_closed = 0::numeric)
                     ->  Sort (cost=75.65..76.37 rows=290 width=20) (actual time=57.243..59.574 rows=1334 loops=1)
                           Sort Key: p.id
                           ->  Merge Join (cost=59.24..63.79 rows=290 width=20) (actual time=33.975..42.215 rows=395 loops=1)
                                 Merge Cond: ("outer".id = "inner".sales_rep_id)
                                 ->  Sort (cost=2.42..2.52 rows=39 width=10) (actual time=1.206..1.285 rows=39 loops=1)
                                       Sort Key: sr.id
                                       ->  Seq Scan on sales_reps sr (cost=0.00..1.39 rows=39 width=10) (actual time=0.028..0.365 rows=39 loops=1)
                                 ->  Sort (cost=56.82..57.55 rows=290 width=20) (actual time=32.566..33.254 rows=395 loops=1)
                                       Sort Key: p.sales_rep_id
                                       ->  Nested Loop (cost=24.35..44.96 rows=290 width=20) (actual time=0.158..25.227 rows=395 loops=1)
                                             Join Filter: ("inner".classification_id = "outer".id)
                                             ->  Seq Scan on partner_classification pc (cost=0.00..1.04 rows=2 width=10) (actual time=0.050..0.096 rows=2 loops=1)
                                                   Filter: ((classification)::text <> 'Sales Rep'::text)
                                             ->  Materialize (cost=24.35..28.70 rows=435 width=30) (actual time=0.028..6.617 rows=435 loops=2)
                                                   ->  Seq Scan on partners p (cost=0.00..24.35 rows=435 width=30) (actual time=0.042..9.941 rows=435 loops=1)
 Total runtime: 6423.683 ms
(37 rows)
-----

The SQL is exactly the same.

The issue is that the query plan is different and, as a result,
performance falls well short of what we need.
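One visible difference is the row estimate on contacts: 93 estimated
(453 actual) in development versus 20120 estimated (1548 actual) in
production, and the EXISTS subplan runs 40262 times there versus 5576
in development. A cheap first check for a gap like that is to refresh
the planner statistics on the tables involved and re-run the EXPLAIN.
A minimal sketch, using the table names from the query above:

-----
-- Refresh planner statistics so row estimates reflect current data,
-- then re-run the EXPLAIN ANALYZE to see whether the plan changes.
ANALYZE contacts;
ANALYZE partners;
ANALYZE partner_classification;
ANALYZE sales_reps;
ANALYZE countries;
ANALYZE lead_requests;
ANALYZE lead_request_status;
-----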

We have 256MB of RAM in the machine. Would it help if
we added more memory?
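For comparison, the settings below bear most directly on plan choice
and sort behaviour, so checking them on both machines is probably more
informative than the raw RAM figure. A minimal sketch (the sort-memory
parameter is sort_mem on 7.x and work_mem on 8.0, so use whichever
matches the server version):

-----
-- Compare these between the development and production clusters; they
-- drive the planner's cost estimates and how much memory sorts may use.
SHOW shared_buffers;
SHOW effective_cache_size;
SHOW random_page_cost;
SHOW sort_mem;   -- PostgreSQL 7.x; on 8.0 the parameter is work_mem
-----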

Please let me know if you have *any* pointers as to
the reason for the difference.

Thank you very much in advance for any suggestions.

JohnM

--
John Mendenhall
john(at)surfutopia(dot)net
surf utopia
internet services
