Re: [SQL/PGQ] Early pruning for GRAPH_TABLE path generation

From: Junwang Zhao <zhjwpku(at)gmail(dot)com>
To: assam258(at)gmail(dot)com
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, SATYANARAYANA NARLAPURAM <satyanarlapuram(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Re: [SQL/PGQ] Early pruning for GRAPH_TABLE path generation
Date: 2026-05-14 00:38:41
Message-ID: CAEG8a3JU3zDjxRCh6hzgw1xo9Y6OBKhV3o1fLdtZxyHYZBAq+w@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Henson,

On Thu, May 14, 2026 at 12:32 AM Henson Choi <assam258(at)gmail(dot)com> wrote:
>
> Hi Junwang,
>
> Thanks for the quick turnaround on v2. Walked through it against the
> prior notes -- everything reads well; one small thing on 0002 to fix
> before the next spin, inline below.
>
>> > 1. A long chain pattern that survives pruning but used to enumerate
>> > the full N^K product (the g5 chain from your benchmark is a
>> > natural fit). Not currently covered in
>> > src/test/regress/sql/graph_table.sql -- worth adding.
>>
>> Added as v2-0002.
>
>
> The g5 chain is a good fit. One nit: the comment above the SELECT
> has a U+2192 arrow in both the .sql and the .out -- unless a regress
> case is specifically exercising encoding, these stay ASCII-only, so
> please replace with a plain `->` in both files.

Agreed, I've applied the change in my machine, will wait for some more
comments and send a new version together later.

>
>>
>> I added the following paragraph to the commit message, feel free to
>> recommend wording.
>>
>> The cyclic case where a closing edge has both endpoints already
>> in the prefix is already exercised by the existing same-variable
>> loop patterns in graph_table.sql (e.g. (a)-[b]->(a)-[b]->(a)),
>> because repeated vertex names are merged into a single path factor
>> before DFS. Likewise, the "all edge candidates pruned" path into
>> generate_query_for_empty_path_pattern() is already hit by the
>> MATCH (o IS orders)-[IS customer_orders]->(c IS customers) case,
>> where no catalog edge matches the declared direction; pruning just
>> makes those branches easier to reach. Neither case needs a
>> dedicated new test beyond what is already there.
>
>
> Good.
>
>>
>> >
>> > ## Nits (all stylistic -- feel free to skip)
>> >
>> > 1. In graph_path_prefix_is_feasible_with_new_element(), the
>> > `if (!IS_EDGE_PATTERN(prev_pe->path_factor->kind)) return true;`
>> > branch is unreachable -- gram.y enforces V-E alternation, and
>> > the path-factor linkage code in
>> > generate_queries_for_path_pattern() re-asserts it. Tightening
>> > to `Assert(IS_EDGE_PATTERN(...))` would match the rest of the
>> > file.
>>
>> Yeah, for a pattern such as (a)-[e1]-(b)-[e2]-(a)-[e3]-(c), the path_factors
>> is V(a), E(e1), V(b), E(e2), E(e3), V(c), though not totally V-E alternation,
>> for a Vertex, we can sure that the prev_pe is always Edge, so changed
>> to Assert.
>
>
> Thanks -- I hadn't realized path_factors isn't strictly V-E
> alternating; your E(e2), E(e3) example pins down the actual
> invariant the Assert relies on.
>
>>
>> >
>> > 2. `normal_ok` / `reverse_ok` reads slightly off from this file's
>> > diction. Consider `feasible` / `rev_feasible` -- the asymmetric
>> > `rev_` prefix matches the existing `edge_qual` / `rev_edge_qual`
>> > in generate_query_for_graph_path(), and echoes the function name
>> > itself.
>>
>> Changed.
>>
>> >
>> > 3. A one-line comment above the V-E alternation check (whether
>> > kept as `if` or tightened per nit 1) citing parse analysis would
>> > save the next reader a trip through gram.y.
>>
>> I added the following, feel free to suggest wording.
>>
>> /*
>> * Merged duplicate vertices only drop redundant factors from path_factors,
>> * not from the DFS path; preceding slot is always an edge for a vertex.
>> */
>
>
> +1, and actually a better framing than the gram.y pointer I
> originally suggested -- this is the invariant the Assert relies on.
>
>>
>>
>> >
>> > 4. In the prune branch of generate_queries_for_path_pattern_recurse(),
>> > a one-liner noting that an exhausted level returns an empty list,
>> > which the caller routes to generate_query_for_empty_path_pattern(),
>> > would help -- pre-existing behavior, but far more reachable after
>> > the patch.
>>
>> Added, feel free to suggest wording.
>
>
> Good.
>
>>
>>
>> >
>> > ## On the trailing edge_qual guard
>> >
>> > The `if (edge_qual == NULL) return NULL;` in
>> > generate_query_for_graph_path() becomes unreachable after the patch,
>> > but worth keeping as a defensive backstop -- it is the literal dual
>> > of the helper's check, and removing it would only tighten coupling
>> > for no measurable gain. A short comment marking it as such would help.
>>
>> Added.
>
>
> Thanks.
>
> With the ASCII fix in 0002, no further comments from my side.

Will do, thanks :-)

>
> Regards,
> Henson

--
Regards
Junwang Zhao

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Smith 2026-05-14 01:54:19 Re: Proposal: Conflict log history table for Logical Replication
Previous Message Henson Choi 2026-05-14 00:08:04 Re: [PATCH] Add pg_current_vxact_id() function to expose virtual transaction IDs