Re: SQL Property Graph Queries (SQL/PGQ)

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: assam258(at)gmail(dot)com
Cc: Alexander Lakhin <exclusion(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Vik Fearing <vik(at)postgresfriends(dot)org>, Ajay Pal <ajay(dot)pal(dot)k(at)gmail(dot)com>, Imran Zaheer <imran(dot)zhir(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SQL Property Graph Queries (SQL/PGQ)
Date: 2026-03-23 15:00:10
Message-ID: CAExHW5tktsyaJRnfR1p3+H-BNB-KnFAs-FBdPwQqxpppp5PqaA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 21, 2026 at 6:19 PM Henson Choi <assam258(at)gmail(dot)com> wrote:
>
> Hi Alexander,
>
>> I've discovered an assertion failure produced with the following script:
>> CREATE TABLE v1 (id int primary key, t text);
>> CREATE TABLE v2 (id int primary key, t text);
>> CREATE TABLE e1_2 (id_1 int, id_2 int, t text);
>> CREATE PROPERTY GRAPH g1
>> VERTEX TABLES (v1, v2)
>> EDGE TABLES (
>> e1_2 key (id_1, id_2)
>> SOURCE KEY (id_1) REFERENCES v1 (id)
>> DESTINATION KEY (id_2) REFERENCES v2 (id)
>> );
>>
>> SELECT count(*) FROM GRAPH_TABLE (g1 MATCH ()-> COLUMNS (1 AS one));
>

Thanks for the report.

>
> I looked into this. The pattern `()->` causes an assertion failure
> at rewriteGraphTable.c:783 because the edge's dest_pf is NULL -- there
> is no destination vertex following the edge pattern.
>
> Per SQL/PGQ standard (ISO/IEC 9075-16:2023), Section 10.6 Syntax
> Rules 13c and 13d, implicit empty vertex patterns should be inserted
> when an edge pattern is not preceded or followed by a vertex pattern
> at the same depth of graph pattern matching. Specifically:
>
> SR 13c: If an edge pattern EP contained in a <path term> PST at
> the same depth of graph pattern matching is not preceded
> by a <vertex pattern> contained in PST at the same depth
> of graph pattern matching, then an implicit empty
> <vertex pattern> VP is inserted in PST immediately
> prior to EP.
>
> SR 13d: If an edge pattern EP contained in a <path term> PST at
> the same depth of graph pattern matching is not followed
> by a <vertex pattern> contained in PST at the same depth
> of graph pattern matching, then an implicit empty
> <vertex pattern> VP is inserted in PST immediately
> after EP.
>
> So `()->` should be treated as `()->()`, and `->` as `()->()`.
> This also applies to left-pointing edges (`<-`, `()<-`) and
> any-direction edges (`-[]-`).
>
> The crash affects all patterns where a vertex is missing adjacent
> to an edge:
>
> -- trailing edge (SR 13d)
> MATCH ()-> MATCH ()<- MATCH ()-[]-
>
> -- leading edge (SR 13c)
> MATCH ->() MATCH <-()
>
> -- edge only (SR 13c + 13d)
> MATCH -> MATCH <- MATCH -[]-
>
> -- consecutive edges (SR 13b, with separator to avoid SR 3)
> MATCH ()-> ->()
>
> In a non-assert build, this results in a segfault (NULL pointer
> dereference at pf->dest_pf->factorpos).
>
> I have a fix that inserts implicit empty vertex patterns in the
> rewrite stage (not the parser), so that ruleutils can still
> reconstruct the original user-written pattern from the parse tree.
> The fix handles all three cases: SR 13b (consecutive edges),
> SR 13c (leading edge), and SR 13d (trailing edge).
>
> The change is in generate_queries_for_path_pattern(), before the
> path factor linking loop:
>
> /*
> * Per SQL standard 10.6 SR 13b/c/d, insert implicit empty vertex
> * patterns where needed:
> * SR 13b: between two consecutive edge patterns
> * SR 13c: before a leading edge pattern
> * SR 13d: after a trailing edge pattern
> *
> * We do this in the rewrite stage (not the parser) so that ruleutils
> * can reconstruct the original user-written pattern from the parse tree.
> */
> {
> List *expanded = NIL;
> GraphElementPattern *prev_gep = NULL;
>
> foreach_node(GraphElementPattern, gep, path_pattern)
> {
> /* SR 13c: leading edge needs a vertex before it */
> /* SR 13b: consecutive edges need a vertex between them */
> if (IS_EDGE_PATTERN(gep->kind) &&
> (prev_gep == NULL || IS_EDGE_PATTERN(prev_gep->kind)))
> {
> GraphElementPattern *implicit_vp = makeNode(GraphElementPattern);
>
> implicit_vp->kind = VERTEX_PATTERN;
> expanded = lappend(expanded, implicit_vp);
> }
>
> expanded = lappend(expanded, gep);
> prev_gep = gep;
> }
>
> /* SR 13d: trailing edge needs a vertex after it */
> if (prev_gep && IS_EDGE_PATTERN(prev_gep->kind))
> {
> GraphElementPattern *implicit_vp = makeNode(GraphElementPattern);
>
> implicit_vp->kind = VERTEX_PATTERN;
> expanded = lappend(expanded, implicit_vp);
> }
>
> path_pattern = expanded;
> }

An empty pattern has no label expression which is equivalent to
disjunction of all applicable labels. If we add an empty pattern in
the rewrite phase, we will require its label expression to be resolved
in the rewrite phase. Looking at the proposed changes for all
properties reference [1], I am moving towards resolving the label
expression in the transformation stage itself. So I think it's better
to add the implicit vertex patterns during the transformation stage to
be future proof. Rewriter should just then rewrite the patterns into
joins and unions. Your point is correct that the implicit vertex
should not be visible in view definition or deparsed query. For that I
have added a flag implicit to GraphElementPattern.
GraphElementPatterns with implicit = true are ignored by ruleutils.
Implemented it that way in the attached patch.

Vertex patterns without an edge pattern in between are not supported
OTOH. Added an ereport for the same.

I have also added a few tests. I didn't add queries with all the
patterns you mentioned above. I tested a few by hand and all of them
worked as expected. Can you please check?

>
> Separately, while testing various edge patterns I noticed that
> `<-[]->` (full edge left or right) produces a syntax error:
>
> SELECT count(*) FROM GRAPH_TABLE (g1 MATCH ()<-[]->() COLUMNS (1 AS one));
> ERROR: syntax error at or near "->"
>
> Per the standard BNF, this should be valid:
>
> <full edge left or right> ::=
> <left arrow bracket> <element pattern filler> <bracket right arrow>
>
> i.e., `<-[]->` = `<-[` + `]->`. This is a separate parser issue,
> not related to the implicit vertex insertion above.

Yes. That's a grammar issue. gram.y doesn't support it. Peter, do you
remember or know the reason why we don't support full edge left or
right? In fact, I am wondering what's the difference between full edge
left or right and full edge any direction?

Here's cumulative patchset

0001 - Fixes the segmentation fault reported in [2]. There is some
difference of opinion about supporting non-local variable references
[3].
0002 - implicit vertex patterns as discussed in this email
0003: Cleanup patch proposed by Man Zeng [4]. Added some cosmetic
changes to ereports in parse_graphtable.c. It can wait to collect more
such cleanups before committing this.

[1] https://www.postgresql.org/message-id/CAExHW5tYCE9QyCvVraKUeesKW5RTR%2Bmrzsg3u64qSps-RPJR5A%40mail.gmail.com
[2] https://www.postgresql.org/message-id/tencent_0F596C3C557454CE7EC420B4@qq.com
[3] https://www.postgresql.org/message-id/CAAAe_zCA_9=JWBf4S5o_C2=EO=6dBZ6-=T0TUH+z5cvJkrtaUA@mail.gmail.com
[4] https://www.postgresql.org/message-id/tencent_6CA50FEF38D76F7C5BD8C765@qq.com
--
Best Wishes,
Ashutosh Bapat

Attachment Content-Type Size
v20260323-0002-Implicit-vertex-patterns.patch text/x-patch 12.5 KB
v20260323-0003-Cleanup-and-other-cosmetic-fixes.patch text/x-patch 6.2 KB
v20260323-0001-Cross-variable-references-in-graph-pattern.patch text/x-patch 4.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2026-03-23 15:09:30 Re: Add RISC-V Zbb popcount optimization
Previous Message Heikki Linnakangas 2026-03-23 14:41:18 Re: Stack-based tracking of per-node WAL/buffer usage