Re: slower merge join on sorted data chosen over

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: slower merge join on sorted data chosen over
Date: 2005-10-17 20:30:24
Message-ID: 1129581024.8300.712.camel@localhost.localdomain
Lists: pgsql-hackers

On Mon, 2005-10-17 at 14:55 -0500, Jim C. Nasby wrote:
> On Tue, Oct 11, 2005 at 10:58:58AM +0100, Simon Riggs wrote:
> > On Mon, 2005-10-10 at 15:14 -0500, Kevin Grittner wrote:
> > > We are looking at doing much more with PostgreSQL over the
> > > next two years, and it seems likely that this issue will come up
> > > again where it is more of a problem. It sounded like there was
> > > some agreement on HOW this was to be fixed, yet I don't see
> > > any mention of doing it in the TODO list.
> >
> > > Is there any sort of
> > > estimate for how much programming work would be involved?
> >
> > The main work here is actually performance testing, not programming. The
> > cost model is built around an understanding of the timings and costs
> > involved in the execution.
> >
> > Once we have timings to cover a sufficiently large range of cases, we
> > can derive the cost model. Once derived, we can program it. Discussing
> > improvements to the cost model without test results is never likely to
> > convince people. Everybody knows the cost models can be improved, the
> > only question is in what cases? and in what ways?
> >
> > So deriving the cost model needs lots of trustworthy test results that
> > can be assessed and discussed, so we know how to improve things. [...and
> > I don't mean 5 minutes with pg_bench...]

...

> DBT seems to be a reasonable test database

I was discussing finding the cost equations to use within the optimizer
based upon a series of exploratory tests using varying data. That is
different to using the same database with varying parameters. Both sound
interesting, but it is the former that, IMHO, would be the more
important.
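
As a rough illustration of what deriving cost equations from exploratory
tests might look like, here is a minimal sketch. It assumes a two-term
cost equation (a per-page cost plus a per-tuple cost) and fits the two
coefficients to timings by ordinary least squares. The function name
fit_cost_model, the coefficient names c_page and c_tuple, and all the
measurements are hypothetical and invented for illustration; none of this
is taken from the PostgreSQL optimizer or from real test results.

```python
# Hypothetical sketch: fit a two-term cost equation
#   seconds ~= c_page * pages + c_tuple * tuples
# to measured timings by ordinary least squares (normal equations).
# All names and numbers here are assumptions made for illustration.

def fit_cost_model(samples):
    """samples: list of (pages, tuples, seconds). Returns (c_page, c_tuple)."""
    spp = sum(p * p for p, t, s in samples)
    spt = sum(p * t for p, t, s in samples)
    stt = sum(t * t for p, t, s in samples)
    sps = sum(p * s for p, t, s in samples)
    sts = sum(t * s for p, t, s in samples)

    # Determinant of the 2x2 normal-equation matrix. It is zero when
    # pages and tuples vary in lockstep across all test cases, in which
    # case the two costs cannot be told apart.
    det = spp * stt - spt * spt
    if det == 0:
        raise ValueError("test cases do not vary enough to separate the costs")

    c_page = (sps * stt - sts * spt) / det
    c_tuple = (spp * sts - spt * sps) / det
    return c_page, c_tuple


# Invented timings from runs over differently shaped data sets.
measurements = [
    (100, 10_000, 0.2),
    (1_000, 50_000, 1.5),
    (5_000, 100_000, 6.0),
    (200, 80_000, 1.0),
]

c_page, c_tuple = fit_cost_model(measurements)
print(f"per-page cost: {c_page:.6f} s, per-tuple cost: {c_tuple:.8f} s")
```

The error raised for a zero determinant is the point of the sketch: if
every test keeps the same ratio of pages to tuples, as repeated runs
against one database tend to, the fit cannot separate the coefficients.
Separating them takes tests over varying data.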

Best Regards, Simon Riggs
