Re: Why could GEQO produce plans with lower costs than the standard_join_search?

From: Donald Dong <xdong(at)csumb(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Why could GEQO produce plans with lower costs than the standard_join_search?
Date: 2019-05-23 17:05:29
Message-ID: B026DB12-961E-4529-9F08-B9C68826D9E5@csumb.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On May 23, 2019, at 9:02 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Donald Dong <xdong(at)csumb(dot)edu> writes:
>> On May 22, 2019, at 11:42 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> You're still asking us to answer hypothetical questions unsupported
>>> by evidence. In what case does that really happen?
>
>> I attached the query plan and debug_print_rel output for GEQO and
>> standard_join_search.
>
>> planstate->total_cost cheapest_total_path
>> GEQO 54190.13 54239.03
>> STD 54179.02 54273.73
>
>> Here I observe GEQO produces a lower
>> cheapest_total_path->total_cost, but its planstate->total_cost is higher
>> than what standard_join_search produces.
>
> Well,
>
> (1) the plan selected by GEQO is in fact more expensive than
> the one found by the standard search. Not by much --- as Andrew
> observes, this difference is less than what the planner considers
> "fuzzily the same" --- but nonetheless 54190.13 > 54179.02.
>
> (2) the paths you show do not correspond to the finally selected
> plans --- they aren't even the same shape. (The Gathers are in
> different places, to start with.) I'm not sure where you were
> capturing the path data, but it looks like you missed top-level
> parallel-aggregation planning, and that managed to find some
> plans that were marginally cheaper than the ones you captured.
> Keep in mind that GEQO only considers join planning, not
> grouping/aggregation.
>
> Andrew's point about fuzzy cost comparison is also a good one,
> though we needn't invoke it to explain these particular numbers.

Oh, that's very good to know! I captured the path at the end of the
join_search_hook. If I understood correctly, top-level
parallel-aggregation will be applied later, so GEQO is not taking it
into consideration during the join searching?

By looking at the captured costs, I thought GEQO found a better join
order than the standard_join_search. However, the final plan using
the join order produced by GEQO turns out to be more expansive. Would
that imply if GEQO sees a join order which is identical to the one
produced by standard_join_search, it will discard it since the
cheapest_total_path has a higher cost, though the final plan may be
cheaper?

Here is another query (JOB/27a.sql) which has more significant cost
differences:

planstate->total_cost cheapest_total_path
GEQO 343016.77 343016.75
STD 342179.13 344137.33

Regards,
Donald Dong

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2019-05-23 17:11:34 Re: FullTransactionId changes are causing portability issues
Previous Message Magnus Hagander 2019-05-23 16:56:49 Re: initdb recommendations