Re: anti-join chosen even when slower than old plan

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Cédric Villemain <cedric(dot)villemain(dot)debian(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Mladen Gogala <mladen(dot)gogala(at)vmsinfo(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: anti-join chosen even when slower than old plan
Date: 2010-11-12 18:57:38
Message-ID: AANLkTikJWuGy2ejHSFygF-HTzkOewefLx=ewDL1V2B-o@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Fri, Nov 12, 2010 at 11:43 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> I think his point is that we already have a proven formula
> (Mackert-Lohmann) and shouldn't be inventing a new one out of thin air.
> The problem is to figure out what numbers to apply the M-L formula to.

I'm not sure that's really measuring the same thing, although I'm not
opposed to using it if it produces reasonable answers.

> I've been thinking that we ought to try to use it in the context of the
> query as a whole rather than for individual table scans; the current
> usage already has some of that flavor but we haven't taken it to the
> logical conclusion.

That's got a pretty severe chicken-and-egg problem though, doesn't it?
You're going to need to know how much data you're touching to
estimate the costs so you can pick the best plan, but you can't know
how much data will ultimately be touched until you've got the whole
plan.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Robert Haas 2010-11-12 21:07:47 Re: questions regarding shared_buffers behavior
Previous Message Scott Carey 2010-11-12 18:19:59 Re: MVCC performance issue