Re: Mini improvement: statement_cost_limit

From: Greg Smith <gsmith(at)gregsmith(dot)com>
To: daveg <daveg(at)sonic(dot)net>
Cc: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>, pgsql-hackers(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com>, Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: Mini improvement: statement_cost_limit
Date: 2008-08-04 19:09:34
Message-ID: Pine.GSO.4.64.0808041358470.29907@westnet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 4 Aug 2008, daveg wrote:

> On Sun, Aug 03, 2008 at 10:57:55PM -0400, Robert Treat wrote:
>> Not such a great argument. Cost models on development servers can and often
>> are quite different from those on production, so you might be putting an
>> artifical limit on top of your developers.
>
> We load the production dumps into our dev environment, which are the same
> hardware spec, so the costs should be identical.

Not identical, just close. ANALYZE samples data from your table randomly.
The statistics used to compute the costs will therefore be slightly
different on the two servers even if the data is the same. The problem of
discovering one plan on production and another on development is not quite
that easy to remove. Ultimately, if your developers aren't thorough
enough to do thinks like look at EXPLAIN plans enough to discover things
that are just bad, I just chuckle at your thinking that putting a single
limiter on their bad behavior will somehow magically make that better.

Anyway, if your production server is small enough that you can afford to
have another one just like it for the developers to work on, that's great.
Robert's point is that many installs don't work like that. The
development teams in lots of places only get a subset of the production
data because it's too large to deploy on anything but a big server, which
often is hard to cost justify buying just for development purposes.

I like the concept of a cost limit, but I'm a bit horrified by the thought
of it being exposed simply through the internal cost numbers because they
are so arbitrary. One of the endless projects I think about but never
start coding is to write something that measures the things the planner
cost constants estimate on a particular machine, so that all those numbers
actually can be tied to some real-world time measure. If you did that,
you'd actually have a shot at accomplishing the real goal here, making
statement_cost_limit cut off statements expected to take longer than
statement_timeout before they even get started.

--
* Greg Smith gsmith(at)gregsmith(dot)com http://www.gregsmith.com Baltimore, MD

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Hannu Krosing 2008-08-04 19:13:17 Re: PL/Python
Previous Message David Blewett 2008-08-04 19:02:15 Re: PL/Python