Skip site navigation (1) Skip section navigation (2)

Re: Tuning planner cost estimates

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Tuning planner cost estimates
Date: 2005-05-19 16:31:47
Message-ID: 200505190931.47662.josh@agliodbs.com (view raw or flat)
Thread:
Lists: pgsql-performance
Jim,

> I've been doing some work to try and identify the actual costs
> associated with an index scan with some limited sucess. What's been run
> so far can be seen at http://stats.distributed.net/~decibel. But there's
> a couple problems. First, I can't use the box exclusively for this
> testing, which results in some result inconsistencies.

I can get you access to boxes.  Chat on IRC?

> Second, I've been 
> using a dataset that I can't make public, which means no one else can
> run these tests on different hardware.

Then use one of the DBT databases.

> In the
> case of testing index scans, we need to be able to vary correlation,
> which so far I've been doing by ordering by different columns. I suspect
> it will also be important to test with different tuple sizes. There's
> also the question of whether or not the cache should be flushed for each
> run or not.
>
> Does this sound like a good way to determine actual costs for index
> scans (and hopefully other access methods in the future)? If so, what
> would be a good way to implement this?

Well, the problem is that what we need to index scans is a formula, rather 
than a graph.   The usefulness of benchmarking index scan cost is so that we 
can test our formula for accuracy and precision.  However, such a formula 
*does* need to take into account concurrent activity, updates, etc ... that 
is, it needs to approximately estimate the relative cost on a live database, 
not a test one.

This is also going to be a moving target because Tom's in-memory-bitmapping 
changes relative cost equations.

I think a first step would be, in fact, to develop a tool that allows us to 
put EXPLAIN ANALYZE results in a database table.  Without that, there is no 
possibility of statistical-scale analysis.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

In response to

Responses

pgsql-performance by date

Next:From: Donald CourtneyDate: 2005-05-19 16:54:03
Subject: Re: PostgreSQL strugling during high load
Previous:From: Vivek KheraDate: 2005-05-18 19:08:17
Subject: Re: [pgsql-benchmarks] Error when try installing pgbench ?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group