Quick Links

Re: TB-sized databases

From:	Simon Riggs <simon(at)2ndquadrant(dot)com>
To:	Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
Cc:	pgsql-performance(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Gregory Stark <stark(at)enterprisedb(dot)com>, Csaba Nagy <nagy(at)ecircle-ag(dot)com>, Bill Moran <wmoran(at)collaborativefusion(dot)com>
Subject:	Re: TB-sized databases
Date:	2007-12-06 09:38:16
Message-ID:	1196933896.4255.340.camel@ebony.site
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Robert,

On Wed, 2007-12-05 at 15:07 -0500, Robert Treat wrote:

> If the whole performance of your system depends upon indexed access, then
> maybe you need a database that gives you a way to force index access at the
> query level?

That sounds like a request for hints, which is OT here, ISTM.

The issue is that if somebody issues a "large query" then it will be a
problem whichever plan the query takes. Forcing index scans can make a
plan more expensive than a seq scan in many cases.

> > e.g. An 80GB table has 8 million blocks in it.
> > - So putting a statement_cost limit = 1 million would allow some fairly
> > large queries but prevent anything that did a SeqScan (or worse).
> > - Setting it 10 million is going to prevent things like sorting the
> > whole table without a LIMIT
> > - Setting it at 100 million is going to prevent unconstrained product
> > joins etc..
>
> I think you're completly overlooking the effect of disk latency has on query
> times. We run queries all the time that can vary from 4 hours to 12 hours in
> time based solely on the amount of concurrent load on the system, even though
> they always plan with the same cost.

Not at all. If we had statement_cost_limit then it would be applied
after planning and before execution begins. The limit would be based
upon the planner's estimate, not the likely actual execution time.

So yes a query may vary in execution time by a large factor as you
suggest, and it would be difficult to set the proposed parameter
accurately. However, the same is also true of statement_timeout, which
we currently support, so I don't see this point as an blocker.

Which leaves us at the burning question: Would you use such a facility,
or would the difficulty in setting it exactly prevent you from using it
for real?

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

In response to

Re: TB-sized databases at 2007-12-05 20:07:21 from Robert Treat

Responses

Re: TB-sized databases at 2007-12-06 15:42:14 from Michael Stone
Re: TB-sized databases at 2007-12-07 17:45:08 from Robert Treat

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Michael Stone	2007-12-06 15:42:14	Re: TB-sized databases
Previous Message	Dave Page	2007-12-06 08:50:56	Re: Evaluation of PG performance vs MSDE/MSSQL 2000 (not 2005)