Re: slower merge join on sorted data chosen over

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject: Re: slower merge join on sorted data chosen over
Date: 2005-10-26 21:06:35
Message-ID: 20051026210635.GG16682@pervasive.com
Lists: pgsql-hackers

On Mon, Oct 17, 2005 at 09:30:24PM +0100, Simon Riggs wrote:
> On Mon, 2005-10-17 at 14:55 -0500, Jim C. Nasby wrote:
> > On Tue, Oct 11, 2005 at 10:58:58AM +0100, Simon Riggs wrote:
> > > On Mon, 2005-10-10 at 15:14 -0500, Kevin Grittner wrote:
> > > > We are looking at doing much more with PostgreSQL over the
> > > > next two years, and it seems likely that this issue will come up
> > > > again where it is more of a problem. It sounded like there was
> > > > some agreement on HOW this was to be fixed, yet I don't see
> > > > any mention of doing it in the TODO list.
> > >
> > > > Is there any sort of
> > > > estimate for how much programming work would be involved?
> > >
> > > The main work here is actually performance testing, not programming. The
> > > cost model is built around an understanding of the timings and costs
> > > involved in the execution.
> > >
> > > Once we have timings to cover a sufficiently large range of cases, we
> > > can derive the cost model. Once derived, we can program it. Discussing
> > > improvements to the cost model without test results is never likely to
> > > convince people. Everybody knows the cost models can be improved, the
> > > only question is in what cases? and in what ways?
> > >
> > > So deriving the cost model needs lots of trustworthy test results that
> > > can be assessed and discussed, so we know how to improve things. [...and
> > > I don't mean 5 minutes with pg_bench...]
>
> ...
>
> > DBT seems to be a reasonable test database
>
> I was discussing finding the cost equations to use within the optimizer
> based upon a series of exploratory tests using varying data. That is
> different to using the same database with varying parameters. Both sound
> interesting, but it is the former that, IMHO, would be the more
> important.

True, although that doesn't necessarily mean you can't use the same data
generation. For the testing I was doing before, I was just varying
correlation, either by running CLUSTER or by selecting from different
fields that have different correlations.
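
To illustrate what "varying correlation" means for generated test data,
here is a rough sketch in Python. It is not the script used in the tests
mentioned above; the function names and the partial-shuffle approach are
assumptions made for illustration. It builds a column whose values are
partly out of physical order and measures the Pearson correlation between
row position and value.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def partially_shuffled(n, disorder, seed=0):
    """Return the values 0..n-1 with a fraction `disorder` of the
    positions shuffled among themselves (hypothetical helper)."""
    rng = random.Random(seed)
    values = list(range(n))
    k = int(n * disorder)
    positions = rng.sample(range(n), k)   # positions that get disturbed
    picked = [values[p] for p in positions]
    rng.shuffle(picked)
    for p, v in zip(positions, picked):
        values[p] = v
    return values


def physical_correlation(values):
    """Correlation between row position and stored value."""
    return pearson(list(range(len(values))), values)


if __name__ == "__main__":
    for disorder in (0.0, 0.2, 0.5, 0.8, 1.0):
        col = partially_shuffled(10000, disorder)
        print(disorder, round(physical_correlation(col), 3))
```

Loading each generated column into its own table (or its own field of
one table) would give a set of inputs that differ only in ordering, which
is the variable of interest here.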
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
