Poor query plan chosen in 9.0.3 vs 8.3.7

From: Brian Connolly <bconn(at)labkey(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Poor query plan chosen in 9.0.3 vs 8.3.7
Date: 2011-05-05 17:43:37
Message-ID: BANLkTin2HYy2m_Db354XMf=gDxxf7R1eCg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

We have a generated query from our web application which takes far longer to
complete in 9.0.4, than in 8.3.7 (>60sec in 9.0.4 ~10sec in 8.3.7)

The query plan generated in 9.0, includes a Materialize step which takes the
bulk of the time for the query. If I disable materialize (by running set
enable_material='off';) the query takes ~6 seconds to run (NOTE: the query
plan chosen in this case is very similar to the plan chosen in 8.3.7).
>From reading
http://rhaas.blogspot.com/2010/04/materialization-in-postgresql-90.html, I
realize that the planner will be more aggressive in choosing Materialize in
9.0. Is there a way to modify the planner cost settings to minimize it's
use in cases like this?

Below, I have included the query in question and the "explain analyze"
output for the case where enable_material is on. .

NOTE: I will send explain analyze output for when enable_material is 'off'
and information on the postgresql version, settings and server configuration
in a follow up email due the length of email restrictions on the mailing
list.

(A vacuum analyze was run prior to running the queries for this email)

Any help that you can provide would be greatly appreciated.

Query
------------------------------------------------------------------------------------------------
SELECT *
FROM (
SELECT
Peter_SizeByType.RowId AS RowId,
Peter_SizeByType.Name AS Name,
Peter_SizeByType.Type AS Type,
Peter_SizeByType.ZAve AS ZAve,
Peter_SizeByType.TimeLabel AS TimeLabel,
Peter_SizeByType.StorageTemperature AS StorageTemperature,
Peter_SizeByType.Pdl AS Pdl,
Peter_SizeByType.Cumulants AS Cumulants,
Peter_SizeByType.AnalysisTool AS AnalysisTool,
Peter_SizeByType.meanCountRate AS meanCountRate,
Peter_SizeByType.ExtractionNumber AS ExtractionNumber,
Peter_SizeByType.TestNumber AS TestNumber,
Peter_SizeByType.Sort AS Sort
FROM (SELECT PS_2.rowid AS RowId,
F_3.Name AS Name,
F_3.Type AS Type,
PS_2.zave AS ZAve,
PS_2.timelabel AS TimeLabel,
PS_2.storagetemperature AS StorageTemperature,
PS_2.pdl AS Pdl,
PS_2.cumulants AS Cumulants,
PS_2.analysistool AS AnalysisTool,
PS_2.meancountrate AS meanCountRate,
PS_2.extractionnumber AS ExtractionNumber,
PS_2.testnumber AS TestNumber,
T_4.sort AS Sort
FROM (SELECT c69d129_particle_size_result_fields_5.analysistool AS
analysistool, c69d129_particle_size_result_fields_5.cumulants AS cumulants,
c69d129_particle_size_result_fields_5.extractionnumber AS extractionnumber,
c69d129_particle_size_result_fields_5.pdl AS pdl,
c69d129_particle_size_result_fields_5.rowid AS rowid,
(SELECT RunId FROM exp.Data WHERE RowId =
c69d129_particle_size_result_fields_5.DataId)
AS Run, c69d129_particle_size_result_fields_5$Run$.container AS
Run_Folder, c69d129_particle_size_result_fields_5$Run$Folder$.entityid AS
Run_Folder_EntityId, c69d129_particle_size_result_fields_5$Run$.name AS
Run_Name, c69d129_particle_size_result_fields_5.storagetemperature AS
storagetemperature, c69d129_particle_size_result_fields_5.testnumber AS
testnumber, c69d129_particle_size_result_fields_5.timelabel AS timelabel,
c69d129_particle_size_result_fields_5.zave AS zave,
c69d129_particle_size_result_fields_5.meancountrate AS meancountrate,
c69d129_particle_size_result_fields_5.rowid AS rowid1
FROM (SELECT * FROM assayresult."c69d129_particle_size_result_fields"
WHERE (((SELECT Container FROM exp.Data WHERE RowId = DataId) IN
('d938da12-1b43-102d-a8a2-78911b79dd1c'))))
c69d129_particle_size_result_fields_5
LEFT OUTER JOIN (SELECT * FROM exp.experimentrun
WHERE (((protocollsid LIKE
'urn:lsid:labkey.com:Particle+SizeProtocol.Folder-69:Particle+Size')))
AND (container IN ('d938da12-1b43-102d-a8a2-78911b79dd1c')))
c69d129_particle_size_result_fields_5$Run$ ON ((SELECT RunId FROM exp.Data
WHERE RowId = c69d129_particle_size_result_fields_5.DataId) =
c69d129_particle_size_result_fields_5$Run$.rowid)
LEFT OUTER JOIN core.containers
c69d129_particle_size_result_fields_5$Run$Folder$ ON
(c69d129_particle_size_result_fields_5$Run$.container =
c69d129_particle_size_result_fields_5$Run$Folder$.entityid)) PS_2
INNER JOIN (SELECT Formulations_6.container AS Folder,
Formulations_6$Folder$.entityid AS Folder_EntityId, Formulations_6.name AS
Name, Formulations_6.rowid AS RowId,
CAST((SELECT StringValue FROM exp.ObjectProperty WHERE
exp.ObjectProperty.PropertyId = 560 AND exp.ObjectProperty.ObjectId =
Formulations_6$LSID$_C.ObjectId) AS VARCHAR(4000))
AS Type
FROM (SELECT * FROM exp.material
WHERE (container IN ('d938da12-1b43-102d-a8a2-78911b79dd1c')) AND
((cpastype = 'urn:lsid:labkey.com:SampleSet.Folder-69:Formulations')))
Formulations_6
LEFT OUTER JOIN core.containers Formulations_6$Folder$ ON
(Formulations_6.container = Formulations_6$Folder$.entityid)
LEFT OUTER JOIN exp.object Formulations_6$LSID$_C ON
(Formulations_6.lsid = Formulations_6$LSID$_C.objecturi) AND
Formulations_6.Container = 'd938da12-1b43-102d-a8a2-78911b79dd1c') F_3 ON
PS_2.Run_Name=F_3.Name || '.xls' AND
PS_2.Run_Folder_EntityId=F_3.Folder_EntityId
INNER JOIN (SELECT
CAST((SELECT FloatValue FROM exp.ObjectProperty WHERE
exp.ObjectProperty.PropertyId = 1288 AND exp.ObjectProperty.ObjectId =
Timepoints_7.objectid) AS INT)
AS sort, Timepoints_7.key AS time
FROM (SELECT * FROM exp.indexvarchar
WHERE ((listid = 17))) Timepoints_7) T_4 ON PS_2.timelabel=T_4.time)
Peter_SizeByType ) x
ORDER BY Name ASC
LIMIT 101

Explain Analyze output with enable_material='on'
------------------------------------------------------------------------------------------------
Limit (cost=233190.18..233190.43 rows=101 width=71) (actual
time=194078.460..194078.645 rows=101 loops=1)
-> Sort (cost=233190.18..233190.44 rows=104 width=71) (actual
time=194078.457..194078.520 rows=101 loops=1)
Sort Key: material.name
Sort Method: top-N heapsort Memory: 39kB
-> Nested Loop Left Join (cost=3.27..233186.69 rows=104 width=71)
(actual time=3996.558..193952.126 rows=67044 loops=1)
Join Filter: ((material.container)::text =
'd938da12-1b43-102d-a8a2-78911b79dd1c'::text)
-> Nested Loop (cost=3.27..232216.91 rows=104 width=155)
(actual time=3996.433..190230.691 rows=67044 loops=1)
Join Filter: ((experimentrun.name)::text = ((
material.name)::text || '.xls'::text))
-> Index Scan using idx_material_lsid on material
(cost=0.00..444.90 rows=774 width=96) (actual time=0.251..14.974 rows=1303
loops=1)
Filter: (((container)::text =
'd938da12-1b43-102d-a8a2-78911b79dd1c'::text) AND ((cpastype)::text =
'urn:lsid:labkey.com:SampleSet.Folder-69:Formulations'::text))
-> Materialize (cost=3.27..230512.55 rows=93
width=129) (actual time=0.023..70.942 rows=67044 loops=1303)
-> Nested Loop (cost=3.27..230512.08 rows=93
width=129) (actual time=0.065..3355.484 rows=67044 loops=1)
-> Nested Loop (cost=0.00..229803.99
rows=93 width=137) (actual time=0.048..2335.540 rows=67044 loops=1)
-> Nested Loop (cost=0.00..2.33
rows=1 width=74) (actual time=0.013..0.023 rows=1 loops=1)
-> Seq Scan on containers
"c69d129_particle_size_result_fields_5$run$folder$" (cost=0.00..1.16 rows=1
width=37) (actual time=0.006..0.009 rows=1 loops=1)
Filter: ((entityid)::text
= 'd938da12-1b43-102d-a8a2-78911b79dd1c'::text)
-> Seq Scan on containers
"formulations_6$folder$" (cost=0.00..1.16 rows=1 width=37) (actual
time=0.002..0.006 rows=1 loops=1)
Filter:
(("formulations_6$folder$".entityid)::text =
'd938da12-1b43-102d-a8a2-78911b79dd1c'::text)
-> Nested Loop
(cost=0.00..229800.73 rows=93 width=63) (actual time=0.032..2169.448
rows=67044 loops=1)
Join Filter:
((c69d129_particle_size_result_fields.timelabel)::text =
(indexvarchar.key)::text)
-> Seq Scan on
c69d129_particle_size_result_fields (cost=0.00..229742.02 rows=348
width=59) (actual time=0.016..527.225 rows=69654 loops=1)
Filter: (((SubPlan
3))::text = 'd938da12-1b43-102d-a8a2-78911b79dd1c'::text)
SubPlan 3
-> Index Scan using
pk_data on data (cost=0.00..3.27 rows=1 width=37) (actual time=0.003..0.004
rows=1 loops=69654)
Index Cond:
(rowid = $2)
-> Materialize
(cost=0.00..1.32 rows=11 width=10) (actual time=0.001..0.010 rows=11
loops=69654)
-> Seq Scan on
indexvarchar (cost=0.00..1.26 rows=11 width=10) (actual time=0.003..0.013
rows=11 loops=1)
Filter: (listid =
17)
-> Index Scan using pk_experimentrun on
experimentrun (cost=3.27..4.33 rows=1 width=74) (actual time=0.004..0.005
rows=1 loops=67044)
Index Cond: (experimentrun.rowid =
(SubPlan 4))
Filter:
(((experimentrun.protocollsid)::text ~~
'urn:lsid:labkey.com:Particle+SizeProtocol.Folder-69:Particle+Size'::text)
AND ((experimentrun.container)::text = 'd938da12-1b43-102d-a8a
2-78911b79dd1c'::text))
SubPlan 4
-> Index Scan using pk_data on
data (cost=0.00..3.27 rows=1 width=4) (actual time=0.003..0.004 rows=1
loops=67044)
Index Cond: (rowid = $2)
SubPlan 4
-> Index Scan using pk_data on
data (cost=0.00..3.27 rows=1 width=4) (actual time=0.003..0.004 rows=1
loops=67044)
Index Cond: (rowid = $2)
-> Index Scan using uq_object on object
"formulations_6$lsid$_c" (cost=0.00..2.69 rows=1 width=64) (actual
time=0.033..0.034 rows=1 loops=67044)
Index Cond: ((material.lsid)::text =
("formulations_6$lsid$_c".objecturi)::text)
SubPlan 1
-> Index Scan using pk_objectproperty on objectproperty
(cost=0.00..3.31 rows=1 width=13) (actual time=0.005..0.007 rows=1
loops=67044)
Index Cond: ((objectid = $0) AND (propertyid = 560))
SubPlan 2
-> Index Scan using pk_objectproperty on objectproperty
(cost=0.00..3.31 rows=1 width=8) (actual time=0.004..0.005 rows=1
loops=67044)
Index Cond: ((objectid = $1) AND (propertyid = 1288))
Total runtime: 194080.893 ms

Thank you,

Brian Connolly

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Robert Haas 2011-05-05 17:53:16 Re: VX_CONCURRENT flag on vxfs( 5.1 or later) for performance for postgresql?
Previous Message Kevin Grittner 2011-05-05 17:16:20 Re: Explicit joins