inputs into query planner costing

From: Mike Roest <mike(dot)roest(at)replicon(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: inputs into query planner costing
Date: 2015-03-31 15:31:29
Message-ID: CAE7Byhht+WY+QZEDMNGEzKxx8Dtg3xiv8yWHHjJdGpJgOmJ8Yw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi there,
I'm having an issue with query performance between 2 different pgsql
environments.

Ther first is our current production postgres server with is running 9.3.5
on Centos 5 x64. The second system is Amazon's RDS postgres as a service.
On our local DB server we have a query that executes in a reasonable amount
of time (600 msec). On RDS the query will run for more then 10 minutes on
a similarly CPU specced systems. I've been working through with Amazon
support and I'm looking for more suggestions on where to look (both for me
and to direct Amazon). The RDS system does use a network filesystem while
our production server is a local RAID10 array, I can see that effecting the
actual performance of the query but not the query planner costing (unless
there's an input to query planner costing that I can't find)

The Query plan costs generated by the 2 systems are vastly different, while
the plans themselves are basically identical other then the materialization
that RDS is doing (if I disable the materialization then they are almost
the same other then a seq scan/heap scan on one small <2000 row table).
All the tables in the query have been analyzed on each server without any
impact

Current Production
Explain:
http://explain.depesz.com/s/Tkyc
Explain Analyze
http://explain.depesz.com/s/UnQt

RDS: (with enable_material=off)
http://explain.depesz.com/s/vDiV

(with enable_material=on)
http://explain.depesz.com/s/HUjx

I have validated that all the query planning configuration variables on
this page
http://www.postgresql.org/docs/9.3/static/runtime-config-query.html are the
same between the 2 environments. If I modify the local production system
values for things like random_page_cost and seq_page_cost to absurd values
like 60000 I can get it to generate a similar planner cost. Similarly if I
lower the RDS values to absurdly low values like .00000001 I can get it to
generate a similarly costed plan (while still performing horridly).

I've reached the end of things I can think about (I'm also working on
rewriting the query but it's a generated query out of a infrastructure
component so it's not a simple change).

Just looking for any ideas on additional things to look into.

The query is available here:
https://www.dropbox.com/s/m31ct6k0mod0576/simplifiedquery.sql?dl=0

--
Data's inconvienient when people have opinions.

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jacobo Vazquez 2015-03-31 16:40:40 Fwd: SSPI authentication ASC_REQ_REPLAY_DETECT flag
Previous Message Deven Phillips 2015-03-31 14:26:56 Re: Muti-table join and roll-up aggregate data into nested JSON?