Re: pg_plan_advice

From: Ajay Pal <ajay(dot)pal(dot)k(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, Lukas Fittl <lukas(at)fittl(dot)com>, Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>, Dian Fay <di(at)nmfay(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: pg_plan_advice
Date: 2026-01-27 07:49:38
Message-ID: CABRHmyvLPcx_K1T9Cwg4tFHiyh95fQU9tUhDwEFtvJiDsQDdaA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

with v12 patch, found below observations,

#1 Grouped Hash Join, This forces the join of dim1 and dim2 to happen
first, and then places that resulting set on the inner side of a Hash
Join against fact.
but the planner partially matches the generated advice.

CREATE TABLE fact (f_id int, d1_id int, d2_id int, d3_id int);
CREATE TABLE dim1 (id int PRIMARY KEY, val text);
CREATE TABLE dim2 (id int PRIMARY KEY, val text);
CREATE TABLE dim3 (id int PRIMARY KEY, val text);

INSERT INTO fact SELECT g, g%10, g%10, g%10 FROM generate_series(1, 10000) g;
INSERT INTO dim1 SELECT g, 'd1-'||g FROM generate_series(0, 9) g;
INSERT INTO dim2 SELECT g, 'd2-'||g FROM generate_series(0, 9) g;
INSERT INTO dim3 SELECT g, 'd3-'||g FROM generate_series(0, 9) g;
ANALYZE fact, dim1, dim2, dim3;

-- We want (dim1 JOIN dim2) to be the inner side of a Hash Join
SET LOCAL pg_plan_advice.advice = 'HASH_JOIN((dim1 dim2))';

postgres=*# EXPLAIN (COSTS OFF, PLAN_ADVICE)
SELECT * FROM fact
JOIN dim1 ON fact.d1_id = dim1.id
JOIN dim2 ON fact.d2_id = dim2.id;
QUERY PLAN
-----------------------------------------------------------
Nested Loop
Disabled: true
-> Nested Loop
Disabled: true
-> Seq Scan on fact
-> Index Scan using dim1_pkey on dim1
Index Cond: (id = fact.d1_id)
-> Index Scan using dim2_pkey on dim2
Index Cond: (id = fact.d2_id)
Supplied Plan Advice:
HASH_JOIN((dim1 dim2)) /* partially matched */
Generated Plan Advice:
JOIN_ORDER(fact dim1 dim2)
NESTED_LOOP_PLAIN(dim1 dim2)
SEQ_SCAN(fact)
INDEX_SCAN(dim1 public.dim1_pkey dim2 public.dim2_pkey)
NO_GATHER(fact dim1 dim2)
(17 rows)

#2 Multiple Instances of Same Table in Subqueries, here target the
second instance of dim1 inside the subquery 'sq'. both seq_scan and
index_scan advices are not matching.

SET LOCAL pg_plan_advice.advice = 'SEQ_SCAN(dim1#2(at)sq)
INDEX_SCAN(dim1(at)sq dim1_pkey)';

postgres=*# EXPLAIN (COSTS OFF, PLAN_ADVICE)
SELECT * FROM fact
JOIN (
SELECT a.id FROM dim1 a
JOIN dim1 b ON a.id = b.id
OFFSET 0
) sq ON fact.d1_id = sq.id;
QUERY PLAN
---------------------------------------------------
Hash Join
Hash Cond: (fact.d1_id = b.id)
-> Seq Scan on fact
-> Hash
-> Seq Scan on dim1 b
Supplied Plan Advice:
SEQ_SCAN(dim1#2(at)sq) /* not matched */
INDEX_SCAN(dim1(at)sq dim1_pkey) /* not matched */
Generated Plan Advice:
JOIN_ORDER(fact sq)
HASH_JOIN(sq)
SEQ_SCAN(b(at)sq fact)
NO_GATHER(fact b(at)sq)
(13 rows)

Thanks
Ajay

On Mon, Jan 26, 2026 at 9:38 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> Here is v12.
>
> The big change in this version is that I've added extensive SGML
> documentation for v0005. If the README was a little too low-level for
> you, this might work better. If you'd like to view it without
> downloading the patch set, I've put it up here:
>
> https://robertmhaas.github.io/postgresql-static/html-pgpa-v12/pgplanadvice.html
>
> Aside from that:
>
> * Added a new GUC pg_plan_advice.always_store_advice_details. Without
> that, you can't generate advice or see feedback on supplied advice
> when using prepared queries, because we don't know at plan time that
> it's right to incur the overhead of generating that stuff, and most of
> the time it won't be.
> * Revoked privileges on pg_clear_collected_shared_advice() as I had
> already done on pg_get_collected_shared_advice().
> * Removed a bogus elog(ERROR) in pgpa_walker_would_advise() in favor
> of returning 0. I think somebody, likely Jakub, pointed this out
> earlier, but I didn't quite absorb what I was being told until I
> rediscovered the problem.
> * Added a bunch more tests. I think the test coverage is getting
> pretty decent now, but it could still use some tests targeting more
> complex scenarios and corner cases. If you are curious about the
> coverage report, see here:
>
> https://robertmhaas.github.io/postgresql-static/coveragereport-pgpa-v12/contrib/pg_plan_advice/index.html
>
> The low number for pgpa_scanner.l is basically bogus, but I don't know
> of a way to make it not bogus. The low number for pgpa_ast.c is due to
> a bunch of things related to bitmap scans not being right, which at
> this point is, I think, the largest outstanding issue with the patch.
> It's probably more interesting to look into ways of covering a few
> more lines from pgpa_planner.c and pgpa_walker.c, which is where a lot
> of the complexity in this code lives. Also, it would be nice to have
> coverage of foreign scan cases, but I'm not quite sure what I need to
> do to create tests for this module that also depend on postgres_fdw.
> Any tips appreciated.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Chao Li 2026-01-27 07:59:01 Re: tablecmds: fix bug where index rebuild loses replica identity on partitions
Previous Message Michael Paquier 2026-01-27 07:48:27 Re: tablecmds: reject CLUSTER ON for partitioned tables earlier