Re: SQL Property Graph Queries (SQL/PGQ)

From: Henson Choi <assam258(at)gmail(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Vik Fearing <vik(at)postgresfriends(dot)org>, Ajay Pal <ajay(dot)pal(dot)k(at)gmail(dot)com>, Imran Zaheer <imran(dot)zhir(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SQL Property Graph Queries (SQL/PGQ)
Date: 2026-03-21 12:48:51
Message-ID: CAAAe_zAfc0W7PZGvbgCB9zifDon-f3YBvzKZCSCNi6GRehdeUQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Alexander,

I've discovered an assertion failure produced with the following script:
> CREATE TABLE v1 (id int primary key, t text);
> CREATE TABLE v2 (id int primary key, t text);
> CREATE TABLE e1_2 (id_1 int, id_2 int, t text);
> CREATE PROPERTY GRAPH g1
> VERTEX TABLES (v1, v2)
> EDGE TABLES (
> e1_2 key (id_1, id_2)
> SOURCE KEY (id_1) REFERENCES v1 (id)
> DESTINATION KEY (id_2) REFERENCES v2 (id)
> );
>
> SELECT count(*) FROM GRAPH_TABLE (g1 MATCH ()-> COLUMNS (1 AS one));

I looked into this. The pattern `()->` causes an assertion failure
at rewriteGraphTable.c:783 because the edge's dest_pf is NULL -- there
is no destination vertex following the edge pattern.

Per SQL/PGQ standard (ISO/IEC 9075-16:2023), Section 10.6 Syntax
Rules 13c and 13d, implicit empty vertex patterns should be inserted
when an edge pattern is not preceded or followed by a vertex pattern
at the same depth of graph pattern matching. Specifically:

SR 13c: If an edge pattern EP contained in a <path term> PST at
the same depth of graph pattern matching is not preceded
by a <vertex pattern> contained in PST at the same depth
of graph pattern matching, then an implicit empty
<vertex pattern> VP is inserted in PST immediately
prior to EP.

SR 13d: If an edge pattern EP contained in a <path term> PST at
the same depth of graph pattern matching is not followed
by a <vertex pattern> contained in PST at the same depth
of graph pattern matching, then an implicit empty
<vertex pattern> VP is inserted in PST immediately
after EP.

So `()->` should be treated as `()->()`, and `->` as `()->()`.
This also applies to left-pointing edges (`<-`, `()<-`) and
any-direction edges (`-[]-`).

The crash affects all patterns where a vertex is missing adjacent
to an edge:

-- trailing edge (SR 13d)
MATCH ()-> MATCH ()<- MATCH ()-[]-

-- leading edge (SR 13c)
MATCH ->() MATCH <-()

-- edge only (SR 13c + 13d)
MATCH -> MATCH <- MATCH -[]-

-- consecutive edges (SR 13b, with separator to avoid SR 3)
MATCH ()-> ->()

In a non-assert build, this results in a segfault (NULL pointer
dereference at pf->dest_pf->factorpos).

I have a fix that inserts implicit empty vertex patterns in the
rewrite stage (not the parser), so that ruleutils can still
reconstruct the original user-written pattern from the parse tree.
The fix handles all three cases: SR 13b (consecutive edges),
SR 13c (leading edge), and SR 13d (trailing edge).

The change is in generate_queries_for_path_pattern(), before the
path factor linking loop:

/*
* Per SQL standard 10.6 SR 13b/c/d, insert implicit empty vertex
* patterns where needed:
* SR 13b: between two consecutive edge patterns
* SR 13c: before a leading edge pattern
* SR 13d: after a trailing edge pattern
*
* We do this in the rewrite stage (not the parser) so that ruleutils
* can reconstruct the original user-written pattern from the parse tree.
*/
{
List *expanded = NIL;
GraphElementPattern *prev_gep = NULL;

foreach_node(GraphElementPattern, gep, path_pattern)
{
/* SR 13c: leading edge needs a vertex before it */
/* SR 13b: consecutive edges need a vertex between them */
if (IS_EDGE_PATTERN(gep->kind) &&
(prev_gep == NULL || IS_EDGE_PATTERN(prev_gep->kind)))
{
GraphElementPattern *implicit_vp =
makeNode(GraphElementPattern);

implicit_vp->kind = VERTEX_PATTERN;
expanded = lappend(expanded, implicit_vp);
}

expanded = lappend(expanded, gep);
prev_gep = gep;
}

/* SR 13d: trailing edge needs a vertex after it */
if (prev_gep && IS_EDGE_PATTERN(prev_gep->kind))
{
GraphElementPattern *implicit_vp = makeNode(GraphElementPattern);

implicit_vp->kind = VERTEX_PATTERN;
expanded = lappend(expanded, implicit_vp);
}

path_pattern = expanded;
}

Separately, while testing various edge patterns I noticed that
`<-[]->` (full edge left or right) produces a syntax error:

SELECT count(*) FROM GRAPH_TABLE (g1 MATCH ()<-[]->() COLUMNS (1 AS one));
ERROR: syntax error at or near "->"

Per the standard BNF, this should be valid:

<full edge left or right> ::=
<left arrow bracket> <element pattern filler> <bracket right arrow>

i.e., `<-[]->` = `<-[` + `]->`. This is a separate parser issue,
not related to the implicit vertex insertion above.

All existing regression tests (graph_table, graph_table_rls, and
the full pg-check suite) pass with this change.

Ashutosh, since you designed the path factor linking logic in
generate_queries_for_path_pattern(), you would know best whether
this insertion point has any side effects on cyclic patterns or
the edge-vertex qual construction downstream. Could you take
this snippet and fold it into a proper patch if it looks right?

Regards,
Henson

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jianghua Yang 2026-03-21 13:09:46 Re: basebackup: add missing deflateEnd() in gzip compression sink
Previous Message Henson Choi 2026-03-21 12:42:52 Re: Row pattern recognition