Re: inputs into query planner costing

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mike Roest <mike(dot)roest(at)replicon(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: inputs into query planner costing
Date: 2015-03-31 18:16:09
Message-ID: 22734.1427825769@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Mike Roest <mike(dot)roest(at)replicon(dot)com> writes:
> I'm having an issue with query performance between 2 different pgsql
> environments.

> Ther first is our current production postgres server with is running 9.3.5
> on Centos 5 x64. The second system is Amazon's RDS postgres as a service.
> On our local DB server we have a query that executes in a reasonable amount
> of time (600 msec). On RDS the query will run for more then 10 minutes on
> a similarly CPU specced systems. I've been working through with Amazon
> support and I'm looking for more suggestions on where to look (both for me
> and to direct Amazon). The RDS system does use a network filesystem while
> our production server is a local RAID10 array, I can see that effecting the
> actual performance of the query but not the query planner costing (unless
> there's an input to query planner costing that I can't find)

> The Query plan costs generated by the 2 systems are vastly different, while
> the plans themselves are basically identical other then the materialization
> that RDS is doing (if I disable the materialization then they are almost
> the same other then a seq scan/heap scan on one small <2000 row table).
> All the tables in the query have been analyzed on each server without any
> impact

Last I checked, there was not any magic pixie dust in the planner ;-).
Your results have to be explained by one or more of these things:

1. Not same version of Postgres between the two systems.

2. Not same planner parameter settings.

3. Different physical table sizes.

4. Different ANALYZE statistics.

As for #1, I have no idea whether Amazon RDS runs a purely stock Postgres
release or has some custom modifications of their own, but it'd be worth
asking about that.

As for #2, you say you checked that, but I'm dubious. In particular this
discrepancy:

Index Scan using uix2pdas_userpolicy on policydataaccessscope policydataaccessscope31 (cost=0.28..8.30 rows=1 width=16)

Index Scan using uix2pdas_userpolicy on policydataaccessscope policydataaccessscope31 (cost=0.28..4.30 rows=1 width=16)

is hard to explain unless the second system is using a smaller
random_page_cost than the first. Maybe somebody used ALTER ROLE SET
or ALTER DATABASE SET to adjust parameters in a way that only affects
some roles/databases?

I suspect that the large differences in some of the seqscan costs might be
explainable by #3, ie those tables are bloated with lots of empty space on
one system but not the other. Comparing pg_relation_size() would be the
way to find out.

I mention #4 for completeness. ANALYZE uses random sampling, so it's
expectable that the data distribution stats would be a bit different on
the two systems, but large differences that persist across multiple
ANALYZE attempts are unlikely. (Although ... you do have the
same default_statistics_target on both systems, no? Table-specific
statistics targets could be a gotcha as well.)

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message zach cruise 2015-03-31 18:56:04 Re: how would you speed up this long query?
Previous Message Andy Colson 2015-03-31 17:56:49 Re: inputs into query planner costing