Re: Why could GEQO produce plans with lower costs than the standard_join_search?

From: "Finnerty, Jim" <jfinnert(at)amazon(dot)com>
To: Donald Dong <xdong(at)csumb(dot)edu>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Why could GEQO produce plans with lower costs than the standard_join_search?
Date: 2019-05-22 21:03:08
Message-ID: C0A4FF4A-76FA-4EFB-B9A7-FE5C84AAD3F1@amazon.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Fwiw, I had an intern do some testing on the JOB last year, and he reported that geqo sometimes produced plans of lower cost than the standard planner (we were on PG10 at the time). I filed it under "unexplained things that we need to investigate when we have time", but alas...

In any case, Donald isn't the only one who has noticed this behavior.

On 5/22/19, 3:54 PM, "Donald Dong" <xdong(at)csumb(dot)edu> wrote:

On May 22, 2019, at 11:42 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Donald Dong <xdong(at)csumb(dot)edu> writes:
>> I find the cost from cheapest_total_path->total_cost is different
>> from the cost from queryDesc->planstate->total_cost. What I saw was
>> that GEQO tends to form paths with lower
>> cheapest_total_path->total_cost (aka the fitness of the children).
>> However, standard_join_search is more likely to produce a lower
>> queryDesc->planstate->total_cost, which is the cost we get using
>> explain.
>
>> I wonder why those two total costs are different? If the total_cost
>> from the planstate is more accurate, could we use that instead as the
>> fitness in geqo_eval?
>
> You're still asking us to answer hypothetical questions unsupported
> by evidence. In what case does that really happen?

Hi,

My apologies if this is not the minimal necessary set up. But here's
more information about what I saw using the following query
(JOB/1a.sql):

SELECT MIN(mc.note) AS production_note,
MIN(t.title) AS movie_title,
MIN(t.production_year) AS movie_year
FROM company_type AS ct,
info_type AS it,
movie_companies AS mc,
movie_info_idx AS mi_idx,
title AS t
WHERE ct.kind = 'production companies'
AND it.info = 'top 250 rank'
AND mc.note NOT LIKE '%(as Metro-Goldwyn-Mayer Pictures)%'
AND (mc.note LIKE '%(co-production)%'
OR mc.note LIKE '%(presents)%')
AND ct.id = mc.company_type_id
AND t.id = mc.movie_id
AND t.id = mi_idx.movie_id
AND mc.movie_id = mi_idx.movie_id
AND it.id = mi_idx.info_type_id;

I attached the query plan and debug_print_rel output for GEQO and
standard_join_search.

planstate->total_cost cheapest_total_path
GEQO 54190.13 54239.03
STD 54179.02 54273.73

Here I observe GEQO produces a lower
cheapest_total_path->total_cost, but its planstate->total_cost is higher
than what standard_join_search produces.

Regards,
Donald Dong

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2019-05-22 21:07:06 Re: Is it safe to ignore the return value of SPI_finish and SPI_execute?
Previous Message Tom Lane 2019-05-22 20:52:34 Re: Is it safe to ignore the return value of SPI_finish and SPI_execute?