Re: How to optimize monstrous query, sorts instead of using index

From: SZUCS Gábor <surrano(at)mailbox(dot)hu>
To: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: How to optimize monstrous query, sorts instead of using index
Date: 2003-06-25 14:15:53
Message-ID: 008e01c33b24$49f091b0$0403a8c0@fejleszt4
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Michael,

This whole query looks like a mess to me. Since I don't know the exact model
and the table stats, I don't even try to rewrite your query, however, here
are the weak points I can think of:

* as Rod pointed out, there are more tables in WHERE that aren't in FROM.
This can be a bug, but the very least, it makes the query far less readable.
These are:

monitorx
monitorstatus_statusitemsx.jdoidx
monitorstatusitemlistd8ea58a5x.jdoidx

* there are 3 index scans that basically steal your time.
They are 1.6..3.5 ms x 625 ~ 1..2 sec each (or I'm reading exp ana wrong,
I'm not an expert indeed):

- Index Scan using monitorstatus_stjdoidb742c9b3i on
monitorstatus_statusitemsx
(cost=0.00..3.01 rows=1 width=16)
(actual time=2.51..2.51 rows=1 loops=625)
Index Cond: ("outer".jdoidx = monitorstatus_statusitemsx.jdoidx)
- Index Scan using monitorstatusitejdoid7db0befci on
monitorstatusitemlistd8ea58a5x
(cost=0.00..3.01 rows=1 width=16)
(actual time=1.68..1.69 rows=1 loops=623)
Index Cond: ("outer".statusitemsx =
monitorstatusitemlistd8ea58a5x.jdoidx)
- Index Scan using monitorstatusitemx_pkey on monitorstatusitemx msi
(cost=0.00..3.01 rows=1 width=665)
(actual time=3.50..3.50 rows=1 loops=623)
Index Cond: ("outer".statusitemlistx = msi.jdoidx)

* another killer index: I think this one takes about the rest of the time
(i.e. 3-4 secs):

-> Index Scan using monitorstatusxmonitori on monitorstatusx ms
(cost=0.00..4695.65 rows=880 width=83)
(actual time=40.17..1868.12 rows=625 loops=1)
Index Cond: ("outer".jdoidx = ms.monitorx)
Filter: ((datex >= '2003-06-20 08:57:21.36'::timestamp without time
zone)
AND (datex <= '2003-06-29 08:57:21.36'::timestamp without time
zone))

Since the number of rows probably can't be reduced (as I read it, the query
actually returned that many rows), I'd think about clever joins in the FROM
part and fewer tables, to use fewer index scans.

Finally, decided to do an ad-hoc adjustment. Try this, or (wild guess) try
to completely eliminate the WHERE part by subselects on ms and monitorx.

This may be faster, slower, or even give different results, based on whether
I guessed the 1:N relationships right or not.

G.
------------------------------- cut here -------------------------------
select
ms.averageconnecttimex as ms_averageconnecttime,
ms.averagedurationx as ms_averageduration,
ms.datex as ms_date, ms.idx as ms_id,
ms.statusstringx as ms_statusstring, ms.statusx as ms_status,

msi.actualcontentx as msi_actualcontent, msi.connecttimex as
msi_connecttime, msi.correctcontentx as msi_correctcontent, msi.datex as
msi_date, msi.descriptionx as msi_description, msi.durationx as
msi_duration, msi.errorcontentx as msi_errorcontent, msi.idx as msi_id,
msi.monitorlocationx as msi_monitorlocation, msi.statusstringx as
msi_statusstring, msi.statusx as msi_status

from monitorstatusx ms
LEFT JOIN monitorx ON (monitorx.jdoidx = ms.monitorx)
LEFT JOIN monitorstatus_statusitemsx ms_si ON (ms.jdoidx =
ms_si.jdoidx)
LEFT JOIN monitorstatusitemlistd8ea58a5x msil ON
(ms_si.statusitemsx = msil.jdoidx)
LEFT JOIN monitorstatusitemx msi ON (msil.statusitemlistx = msi.jdoidx)
where
monitorx.idx = 'M-TEST_1444-TEST_00_10560561260561463219352'
ms.datex >= '2003-06-20 08:57:21.36' AND ms.datex <= '2003-06-29
08:57:21.36'
------------------------------- cut here -------------------------------

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message SZUCS Gábor 2003-06-25 14:20:32 Re: How to optimize monstrous query, sorts instead of
Previous Message Michael Mattox 2003-06-25 14:09:59 Re: How to optimize monstrous query, sorts instead of