Re: TB-sized databases

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gregory Stark <stark(at)enterprisedb(dot)com>, Csaba Nagy <nagy(at)ecircle-ag(dot)com>, Bill Moran <wmoran(at)collaborativefusion(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: TB-sized databases
Date: 2007-11-29 16:42:53
Message-ID: 20071129164253.GK5031@tamriel.snowman.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

* Simon Riggs (simon(at)2ndquadrant(dot)com) wrote:
> On Thu, 2007-11-29 at 10:45 -0500, Tom Lane wrote:
> > Given that this list spends all day every day discussing cases where the
> > planner is wrong, I'd have to think that that's a bet I wouldn't take.
>
> I think you have a point, but the alternative is often much worse.

I'm not convinced you've outlined the consequences of implementing a
plan cost limit sufficiently.

> If an SQL statement fails because of too high cost, we can investigate
> the problem and re-submit. If a website slows down because somebody
> allowed a very large query to execute then everybody is affected, not
> just the person who ran the bad query. Either way the guy that ran the
> query loses, but without constraints in place one guy can kill everybody
> else also.

It's entirely possible (likely even) that most of the users accessing a
webpage are using the same queries and the same tables. If the estimates
for those tables ends up changing enough that PG adjusts the plan cost to
be above the plan cost limit then *all* of the users would be affected.

The plan cost isn't going to change for just one user if it's the same
query that a bunch of users are using. I'm not sure if handling the
true 'rougue query' case with this limit would actually be a net
improvment overall in a website-based situation.

I could see it being useful to set a 'notice_on_high_cost_query'
variable where someone working in a data warehouse situation would get a
notice if the query he's hand-crafting has a very high cost (in which
case he could ctrl-c it if he thinks something is wrong, rather than
waiting 5 hours before realizing he forgot a join clause), but the
website with the one rougue query run by one user seems a stretch.

Thanks,

Stephen

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Csaba Nagy 2007-11-29 16:54:32 Re: TB-sized databases
Previous Message Brad Nicholson 2007-11-29 16:18:39 Re: 7.4 Checkpoint Question