Quick Links

Re: [HACKERS] Optimizer speed and GEQO (was: nested loops in joins)

From:	Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To:	tgl(at)sss(dot)pgh(dot)pa(dot)us (Tom Lane)
Cc:	lockhart(at)alumni(dot)caltech(dot)edu, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: [HACKERS] Optimizer speed and GEQO (was: nested loops in joins)
Date:	1999-02-06 17:51:52
Message-ID:	199902061751.MAA03085@candle.pha.pa.us
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

> >> Why are we maintaining this huge Path List for every RelOptInfo
> >> structure if we only need the cheapest?
>
> I think Bruce is onto something here... keep only the best-so-far
> not everything ever generated. That'd get rid of the O(N^2)
> comparison behavior.
>
> The only thing I can think of that we might possibly *need* the
> whole list for is if it is used as a recursion stopper.
> ("Oh, I already investigated that alternative, no need to go down
> that path again.") It did not look to me like the optimizer does
> such a thing, but I don't claim to understand the code.
>
> It seems to me that the search space of possible paths is
> well-structured and can be enumerated without generation of duplicates.
> You've got a known set of tables involved in a query, a fixed set of
> possible access methods for each table, and only so many ways to
> combine them. So the real question here is why does the optimizer
> even need to check for duplicates --- should it not never generate
> any to begin with? The profiles I ran a few days ago show that indeed
> most of the generated paths are not duplicates, but a small fraction
> are duplicates. Why is that? I'd feel a lot closer to understanding
> what's going on if we could explain where the duplicates come from.

Here is my guess, and I think it is valid. There are cases where a join
of two tables may be more expensive than another plan, but the order of
the result may be cheaper to use for a later join than the cheaper plan.
That is why I suspect they are doing in better_path():

if (samekeys(path->keys, new_path->keys) &&
equal_path_ordering(&path->p_ordering,
&new_path->p_ordering))
{
old_path = path;
break;
}

So I don't think we can grab the cheapest path right away, but I think
the system is keeping around non-cheapest paths with the same result
ordering. I also would like to throw away plans that are so expensive
that sorting another plan to the desired ordering would be cheaper than
using the plan.

I am only guessing on this. How often does this 'keep non-cheapest path
around' really help?

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

In response to

Re: [HACKERS] Optimizer speed and GEQO (was: nested loops in joins) at 1999-02-06 17:06:54 from Tom Lane

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Tom Lane	1999-02-06 20:41:49	Re: [HACKERS] Re: SELECT INTO TABLE busted?
Previous Message	Jan Wieck	1999-02-06 17:28:16	Re: [HACKERS] strange behaviour on pooled alloc