Re: Proposal: add a debug message about using geqo

From: KAWAMOTO Masaya <kawamoto(at)sraoss(dot)co(dot)jp>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: add a debug message about using geqo
Date: 2022-06-02 06:09:39
Message-ID: 20220602150939.dddf0de31c2d9eca1e81a3fa@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 10 May 2022 18:49:54 +0530
Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:

> If we add that information to EXPLAIN output, the user won't need
> access to server logs.
>
> May be we need it in both the places.

That sounds a nice idea. But I don't think that postgres shows in the
EXPLAIN output why the plan is selected. Would it be appropriate to
show that GEQO is used in EXPLAIN output?

As a test, I created a patch that add information about GEQO to
EXPLAIN output by the GEQO option. The output example is as follows.
What do you think about the location and content of information about GEQO?

postgres=# explain (geqo) select o.id, o.date, c.name as customer_name, bar.amount as total_amount
from orders o join customer c on o.customer_id = c.id
join (select foo.id as id, sum(foo.amount) as amount
from (select od.order_id as id, p.name as name, od.quantity as quantity, (p.price * od.quantity) as amount
from order_detail od join product p on od.product_id = p.id
) as foo
group by id) as bar on o.id = bar.id ;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Hash Join (cost=118.75..155.04 rows=200 width=48)
Hash Cond: (o.customer_id = c.id)
-> Hash Join (cost=94.58..130.34 rows=200 width=20)
Hash Cond: (o.id = bar.id)
-> Seq Scan on orders o (cost=0.00..30.40 rows=2040 width=12)
-> Hash (cost=92.08..92.08 rows=200 width=12)
-> Subquery Scan on bar (cost=88.08..92.08 rows=200 width=12)
-> HashAggregate (cost=88.08..90.08 rows=200 width=12)
Group Key: od.order_id
-> Hash Join (cost=37.00..72.78 rows=2040 width=12)
Hash Cond: (od.product_id = p.id)
-> Seq Scan on order_detail od (cost=0.00..30.40 rows=2040 width=12)
-> Hash (cost=22.00..22.00 rows=1200 width=8)
-> Seq Scan on product p (cost=0.00..22.00 rows=1200 width=8)
-> Hash (cost=16.30..16.30 rows=630 width=36)
-> Seq Scan on customer c (cost=0.00..16.30 rows=630 width=36)
GeqoDetails: GEQO: used, geqo_threshold: 3, Max join nodes: 3
(17 rows)

postgres=# set geqo_threshold to 16;
SET
postgres=# explain (geqo) select ... ;
QUERY PLAN
--------------------------------------------------------------------------------------------------------
Hash Join (cost=118.75..155.04 rows=200 width=48)
Hash Cond: (o.customer_id = c.id)
-> Hash Join (cost=94.58..130.34 rows=200 width=20)
Hash Cond: (o.id = bar.id)
-> Seq Scan on orders o (cost=0.00..30.40 rows=2040 width=12)
-> Hash (cost=92.08..92.08 rows=200 width=12)
-> Subquery Scan on bar (cost=88.08..92.08 rows=200 width=12)
-> HashAggregate (cost=88.08..90.08 rows=200 width=12)
Group Key: od.order_id
-> Hash Join (cost=37.00..72.78 rows=2040 width=12)
Hash Cond: (od.product_id = p.id)
-> Seq Scan on order_detail od (cost=0.00..30.40 rows=2040 width=12)
-> Hash (cost=22.00..22.00 rows=1200 width=8)
-> Seq Scan on product p (cost=0.00..22.00 rows=1200 width=8)
-> Hash (cost=16.30..16.30 rows=630 width=36)
-> Seq Scan on customer c (cost=0.00..16.30 rows=630 width=36)
GeqoDetails: GEQO: not used, geqo_threshold: 16, Max join nodes: 3
(17 rows)

postgres=# explain (analyze, settings, geqo) select ...;
QUERY PLAN

------------------------------------------------------------------------------------------------------------------------
--------------------------
Hash Join (cost=118.75..155.04 rows=200 width=48) (actual time=0.104..0.113 rows=3 loops=1)
Hash Cond: (o.customer_id = c.id)
-> Hash Join (cost=94.58..130.34 rows=200 width=20) (actual time=0.042..0.048 rows=3 loops=1)
Hash Cond: (o.id = bar.id)
-> Seq Scan on orders o (cost=0.00..30.40 rows=2040 width=12) (actual time=0.003..0.005 rows=3 loops=1)
-> Hash (cost=92.08..92.08 rows=200 width=12) (actual time=0.034..0.037 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Subquery Scan on bar (cost=88.08..92.08 rows=200 width=12) (actual time=0.031..0.035 rows=3 loops=1)
-> HashAggregate (cost=88.08..90.08 rows=200 width=12) (actual time=0.030..0.033 rows=3 loops=1)
Group Key: od.order_id
Batches: 1 Memory Usage: 56kB
-> Hash Join (cost=37.00..72.78 rows=2040 width=12) (actual time=0.016..0.023 rows=7 loops=
1)
Hash Cond: (od.product_id = p.id)
-> Seq Scan on order_detail od (cost=0.00..30.40 rows=2040 width=12) (actual time=0.0
03..0.004 rows=7 loops=1)
-> Hash (cost=22.00..22.00 rows=1200 width=8) (actual time=0.007..0.008 rows=4 loops=
1)
Buckets: 2048 Batches: 1 Memory Usage: 17kB
-> Seq Scan on product p (cost=0.00..22.00 rows=1200 width=8) (actual time=0.00
4..0.006 rows=4 loops=1)
-> Hash (cost=16.30..16.30 rows=630 width=36) (actual time=0.019..0.020 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on customer c (cost=0.00..16.30 rows=630 width=36) (actual time=0.014..0.016 rows=3 loops=1)
Settings: geqo_threshold = '16'
GeqoDetails: GEQO: not used, geqo_threshold: 16, Max join nodes: 3
Planning Time: 0.516 ms
Execution Time: 0.190 ms
(24 rows)

> On Tue, May 10, 2022 at 6:35 AM KAWAMOTO Masaya <kawamoto(at)sraoss(dot)co(dot)jp> wrote:
> >
> > Hi,
> >
> > During query tuning, users may want to check if GEQO is used or not
> > to generate a plan. However, users can not know it by simply counting
> > the number of tables that appear in SQL. I know we can know it by
> > enabling GEQO_DEBUG flag, but it needs recompiling, so I think it is
> > inconvenient.
> >
> > So, I would like to propose to add a debug level message that shows
> > when PostgreSQL use GEQO. That enables users to easily see it by
> > just changing log_min_messages.
> >
> > Use cases are as follows:
> > - When investigating about the result of planning, user can determine
> > whether the plan is chosen by the standard planning or GEQO.
> >
> > - When tuning PostgreSQL, user can determine the suitable value of
> > geqo_threshold parameter.
> >
> > Best regards.
> >
> > --
> > KAWAMOTO Masaya <kawamoto(at)sraoss(dot)co(dot)jp>
> > SRA OSS, Inc. Japan
>
>
>
> --
> Best Wishes,
> Ashutosh Bapat

--
KAWAMOTO Masaya <kawamoto(at)sraoss(dot)co(dot)jp>
SRA OSS, Inc. Japan

Attachment Content-Type Size
add_geqo_info_to_explain.patch application/octet-stream 6.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Will Mortensen 2022-06-02 06:34:19 [PATCH] fix doc example of bit-reversed MAC address
Previous Message Amit Kapila 2022-06-02 06:08:34 Re: Multi-Master Logical Replication