| From: | Henson Choi <assam258(at)gmail(dot)com> |
|---|---|
| To: | Junwang Zhao <zhjwpku(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
| Subject: | Re: [SQL/PGQ] Early pruning for GRAPH_TABLE path generation |
| Date: | 2026-05-13 16:32:25 |
| Message-ID: | CAAAe_zBnjwgAcZLcsNpMok7WPhqKdRVHGLi4hfbVaDROugb_xg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Junwang,
Thanks for the quick turnaround on v2. Walked through it against the
prior notes -- everything reads well; one small thing on 0002 to fix
before the next spin, inline below.
> 1. A long chain pattern that survives pruning but used to enumerate
> > the full N^K product (the g5 chain from your benchmark is a
> > natural fit). Not currently covered in
> > src/test/regress/sql/graph_table.sql -- worth adding.
>
> Added as v2-0002.
>
The g5 chain is a good fit. One nit: the comment above the SELECT
has a U+2192 arrow in both the .sql and the .out -- unless a regress
case is specifically exercising encoding, these stay ASCII-only, so
please replace with a plain `->` in both files.
> I added the following paragraph to the commit message, feel free to
> recommend wording.
>
> The cyclic case where a closing edge has both endpoints already
> in the prefix is already exercised by the existing same-variable
> loop patterns in graph_table.sql (e.g. (a)-[b]->(a)-[b]->(a)),
> because repeated vertex names are merged into a single path factor
> before DFS. Likewise, the "all edge candidates pruned" path into
> generate_query_for_empty_path_pattern() is already hit by the
> MATCH (o IS orders)-[IS customer_orders]->(c IS customers) case,
> where no catalog edge matches the declared direction; pruning just
> makes those branches easier to reach. Neither case needs a
> dedicated new test beyond what is already there.
>
Good.
> >
> > ## Nits (all stylistic -- feel free to skip)
> >
> > 1. In graph_path_prefix_is_feasible_with_new_element(), the
> > `if (!IS_EDGE_PATTERN(prev_pe->path_factor->kind)) return true;`
> > branch is unreachable -- gram.y enforces V-E alternation, and
> > the path-factor linkage code in
> > generate_queries_for_path_pattern() re-asserts it. Tightening
> > to `Assert(IS_EDGE_PATTERN(...))` would match the rest of the
> > file.
>
> Yeah, for a pattern such as (a)-[e1]-(b)-[e2]-(a)-[e3]-(c), the
> path_factors
> is V(a), E(e1), V(b), E(e2), E(e3), V(c), though not totally V-E
> alternation,
> for a Vertex, we can sure that the prev_pe is always Edge, so changed
> to Assert.
>
Thanks -- I hadn't realized path_factors isn't strictly V-E
alternating; your E(e2), E(e3) example pins down the actual
invariant the Assert relies on.
> >
> > 2. `normal_ok` / `reverse_ok` reads slightly off from this file's
> > diction. Consider `feasible` / `rev_feasible` -- the asymmetric
> > `rev_` prefix matches the existing `edge_qual` / `rev_edge_qual`
> > in generate_query_for_graph_path(), and echoes the function name
> > itself.
>
> Changed.
>
> >
> > 3. A one-line comment above the V-E alternation check (whether
> > kept as `if` or tightened per nit 1) citing parse analysis would
> > save the next reader a trip through gram.y.
>
> I added the following, feel free to suggest wording.
>
> /*
> * Merged duplicate vertices only drop redundant factors from path_factors,
> * not from the DFS path; preceding slot is always an edge for a vertex.
> */
>
+1, and actually a better framing than the gram.y pointer I
originally suggested -- this is the invariant the Assert relies on.
>
> >
> > 4. In the prune branch of generate_queries_for_path_pattern_recurse(),
> > a one-liner noting that an exhausted level returns an empty list,
> > which the caller routes to generate_query_for_empty_path_pattern(),
> > would help -- pre-existing behavior, but far more reachable after
> > the patch.
>
> Added, feel free to suggest wording.
>
Good.
>
> >
> > ## On the trailing edge_qual guard
> >
> > The `if (edge_qual == NULL) return NULL;` in
> > generate_query_for_graph_path() becomes unreachable after the patch,
> > but worth keeping as a defensive backstop -- it is the literal dual
> > of the helper's check, and removing it would only tighten coupling
> > for no measurable gain. A short comment marking it as such would help.
>
> Added.
>
Thanks.
With the ASCII fix in 0002, no further comments from my side.
Regards,
Henson
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Alvaro Herrera | 2026-05-13 16:58:14 | Re: Adding REPACK [concurrently] |
| Previous Message | Álvaro Herrera | 2026-05-13 16:30:13 | Re: [PATCH] Fix errhint messages for REPACK (CONCURRENTLY) restrictions |