| From: | Henson Choi <assam258(at)gmail(dot)com> |
|---|---|
| To: | Alexander Lakhin <exclusion(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
| Cc: | Peter Eisentraut <peter(at)eisentraut(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Vik Fearing <vik(at)postgresfriends(dot)org>, Ajay Pal <ajay(dot)pal(dot)k(at)gmail(dot)com>, Imran Zaheer <imran(dot)zhir(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: SQL Property Graph Queries (SQL/PGQ) |
| Date: | 2026-03-21 12:48:51 |
| Message-ID: | CAAAe_zAfc0W7PZGvbgCB9zifDon-f3YBvzKZCSCNi6GRehdeUQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Alexander,
I've discovered an assertion failure produced with the following script:
> CREATE TABLE v1 (id int primary key, t text);
> CREATE TABLE v2 (id int primary key, t text);
> CREATE TABLE e1_2 (id_1 int, id_2 int, t text);
> CREATE PROPERTY GRAPH g1
> VERTEX TABLES (v1, v2)
> EDGE TABLES (
> e1_2 key (id_1, id_2)
> SOURCE KEY (id_1) REFERENCES v1 (id)
> DESTINATION KEY (id_2) REFERENCES v2 (id)
> );
>
> SELECT count(*) FROM GRAPH_TABLE (g1 MATCH ()-> COLUMNS (1 AS one));
I looked into this. The pattern `()->` causes an assertion failure
at rewriteGraphTable.c:783 because the edge's dest_pf is NULL -- there
is no destination vertex following the edge pattern.
Per SQL/PGQ standard (ISO/IEC 9075-16:2023), Section 10.6 Syntax
Rules 13c and 13d, implicit empty vertex patterns should be inserted
when an edge pattern is not preceded or followed by a vertex pattern
at the same depth of graph pattern matching. Specifically:
SR 13c: If an edge pattern EP contained in a <path term> PST at
the same depth of graph pattern matching is not preceded
by a <vertex pattern> contained in PST at the same depth
of graph pattern matching, then an implicit empty
<vertex pattern> VP is inserted in PST immediately
prior to EP.
SR 13d: If an edge pattern EP contained in a <path term> PST at
the same depth of graph pattern matching is not followed
by a <vertex pattern> contained in PST at the same depth
of graph pattern matching, then an implicit empty
<vertex pattern> VP is inserted in PST immediately
after EP.
So `()->` should be treated as `()->()`, and `->` as `()->()`.
This also applies to left-pointing edges (`<-`, `()<-`) and
any-direction edges (`-[]-`).
The crash affects all patterns where a vertex is missing adjacent
to an edge:
-- trailing edge (SR 13d)
MATCH ()-> MATCH ()<- MATCH ()-[]-
-- leading edge (SR 13c)
MATCH ->() MATCH <-()
-- edge only (SR 13c + 13d)
MATCH -> MATCH <- MATCH -[]-
-- consecutive edges (SR 13b, with separator to avoid SR 3)
MATCH ()-> ->()
In a non-assert build, this results in a segfault (NULL pointer
dereference at pf->dest_pf->factorpos).
I have a fix that inserts implicit empty vertex patterns in the
rewrite stage (not the parser), so that ruleutils can still
reconstruct the original user-written pattern from the parse tree.
The fix handles all three cases: SR 13b (consecutive edges),
SR 13c (leading edge), and SR 13d (trailing edge).
The change is in generate_queries_for_path_pattern(), before the
path factor linking loop:
/*
* Per SQL standard 10.6 SR 13b/c/d, insert implicit empty vertex
* patterns where needed:
* SR 13b: between two consecutive edge patterns
* SR 13c: before a leading edge pattern
* SR 13d: after a trailing edge pattern
*
* We do this in the rewrite stage (not the parser) so that ruleutils
* can reconstruct the original user-written pattern from the parse tree.
*/
{
List *expanded = NIL;
GraphElementPattern *prev_gep = NULL;
foreach_node(GraphElementPattern, gep, path_pattern)
{
/* SR 13c: leading edge needs a vertex before it */
/* SR 13b: consecutive edges need a vertex between them */
if (IS_EDGE_PATTERN(gep->kind) &&
(prev_gep == NULL || IS_EDGE_PATTERN(prev_gep->kind)))
{
GraphElementPattern *implicit_vp =
makeNode(GraphElementPattern);
implicit_vp->kind = VERTEX_PATTERN;
expanded = lappend(expanded, implicit_vp);
}
expanded = lappend(expanded, gep);
prev_gep = gep;
}
/* SR 13d: trailing edge needs a vertex after it */
if (prev_gep && IS_EDGE_PATTERN(prev_gep->kind))
{
GraphElementPattern *implicit_vp = makeNode(GraphElementPattern);
implicit_vp->kind = VERTEX_PATTERN;
expanded = lappend(expanded, implicit_vp);
}
path_pattern = expanded;
}
Separately, while testing various edge patterns I noticed that
`<-[]->` (full edge left or right) produces a syntax error:
SELECT count(*) FROM GRAPH_TABLE (g1 MATCH ()<-[]->() COLUMNS (1 AS one));
ERROR: syntax error at or near "->"
Per the standard BNF, this should be valid:
<full edge left or right> ::=
<left arrow bracket> <element pattern filler> <bracket right arrow>
i.e., `<-[]->` = `<-[` + `]->`. This is a separate parser issue,
not related to the implicit vertex insertion above.
All existing regression tests (graph_table, graph_table_rls, and
the full pg-check suite) pass with this change.
Ashutosh, since you designed the path factor linking logic in
generate_queries_for_path_pattern(), you would know best whether
this insertion point has any side effects on cyclic patterns or
the edge-vertex qual construction downstream. Could you take
this snippet and fold it into a proper patch if it looks right?
Regards,
Henson
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Jianghua Yang | 2026-03-21 13:09:46 | Re: basebackup: add missing deflateEnd() in gzip compression sink |
| Previous Message | Henson Choi | 2026-03-21 12:42:52 | Re: Row pattern recognition |