Quick Links

Re: SQL Property Graph Queries (SQL/PGQ)

From:	Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To:	assam258(at)gmail(dot)com
Cc:	Peter Eisentraut <peter(at)eisentraut(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Vik Fearing <vik(at)postgresfriends(dot)org>, Ajay Pal <ajay(dot)pal(dot)k(at)gmail(dot)com>, Imran Zaheer <imran(dot)zhir(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: SQL Property Graph Queries (SQL/PGQ)
Date:	2026-01-02 08:48:03
Message-ID:	CAExHW5vDB2WRWcT799jpcGcs2piXp8F1jdKZE+a9EGFx2PbjHg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi Henson,

Thanks for your systematic review.

On Wed, Dec 31, 2025 at 11:53 AM Henson Choi <assam258(at)gmail(dot)com> wrote:
>
> Overall assessment: GOOD

Glad to know that.

> - Test Coverage: Good (90.5% line coverage, ~180 test cases)

Thanks for the coverage analysis. Looks good right now. But see below.

> 2.1 Critical / Major
> (None)

Great.

>
> 2.2 Minor Issues
>
> #1 [Code] src/backend/commands/propgraphcmds.c:1632
> FIXME: Missing index for plppropid column causes sequential scan.
> Decision needed: (a) add index, or (c) allow seq scan for rare path.

The path is rare enough that I think we can allow the seq scan. Given
that Peter has marked it as FIXME, it seems we will keep it as is for
now, but will revisit if performance becomes an issue. Peter, please
correct me if I'm wrong.

>
> #2 [Doc] doc/src/sgml/ref/alter_property_graph.sgml:286
> Compatibility section incorrectly states "CREATE PROPERTY GRAPH"
> Should be: "ALTER PROPERTY GRAPH"
>

That's right. Fixed.

> #3 [Doc] doc/src/sgml/ref/alter_property_graph.sgml:65
> Synopsis shows VERTEX/NODE and EDGE/RELATIONSHIP as separate clauses,
> but Description merges them as {VERTEX|NODE|EDGE|RELATIONSHIP}.

Their syntax is slightly different hence they are separate in
Synopsis. If we separate them in the Description, we might have to
repeat the same text twice. Having them merged in the Description
makes it more concise without losing any clarity or meaning. I think
we can keep it as is. What do you think?

>
> #4 [Doc] doc/src/sgml/ref/alter_property_graph.sgml:80
> Grammar error: "the graph removed" should be "the graph is removed"

Right. Fixed.

>
> #5 [Doc] doc/src/sgml/queries.sgml:2929,2931
> TODO comments for unimplemented features without clear limitation notes.

Those TODOs are not visible to users. I think they serve as
placeholders to add the documentation when the features are
implemented. We may want to remove them, but I see no harm in keeping
them for now. Peter, what do you think?

>
> #6 [Doc] doc/src/sgml/ddl.sgml:5717
> Typo: "infrastucture" should be "infrastructure"
>

Fixed. Thanks for catching it.

> 2.3 Alternative Approaches for Discussion
>
> #1 Support CREATE PROPERTY GRAPH IF NOT EXISTS
> Rationale: PostgreSQL-style extension, consistent with other DDL.
>

I think this will be good-to-have for consistency across DDLs.
However, I think it is better to discuss and implement it separately
since lack of it does not block the main functionality of property
graphs. Meantime users can DROP and CREATE. There are other DDLs which
do not have this support. What do you think?

> #2 Return 0 rows for non-existent labels instead of error
> Current: SELECT * FROM GRAPH_TABLE(g MATCH (a:NonExistent) ...) raises error
> Alternative: Return empty result set instead
> Rationale: Better for exploratory queries, similar to SQL WHERE on
> non-existent values returning 0 rows rather than error.

I think this makes sense in a proper graph database where the labels
are not predefined. However, in property graphs the labels, the
properties associated with them and data types of those properties are
predefined in the graph schema. If the specified label does not exist
in the graph schema, we can not determine what properties are
associated with it and their data types. In turn we can not determine
the shape of the result set, which is essential for reporting even 0
rows. Hence the error instead of 0 rows. Oracle has similar behaviour.

>
> #3 Return 0 rows when same variable has different labels
> Current: SELECT * FROM GRAPH_TABLE(g MATCH (a:Person)-[e]-(a:Company) ...)
> raises error because variable 'a' has conflicting labels.
> Alternative: Return empty result set instead
> Rationale: Logically, a node cannot be both Person AND Company,
> so 0 rows is the correct result. Consistent with standard SQL
> WHERE semantics (impossible condition = 0 rows, not error).
>

In such a case 0 rows isn't a correct result. According to the
standard, variable a is bound to all the nodes that have both the
labels. This is label conjunction, a feature we don't support right
now. Hence the "unsupported feature" error. I expect that some day we
will implement label conjunction, and then the same query will return
non-zero rows if there are nodes with both labels.

> -------------------------------------------------------------------------------
> TEMP graph CREATE TEMP PROPERTY GRAPH Medium

Done

> TEMP graph TEMP graph with permanent table reference Medium

Done

> TEMP graph ALTER ADD permanent table to TEMP graph Medium

Done

> CASCADE DROP ... CASCADE (dependent view cascade) Low

Done

> RESTRICT DROP ... RESTRICT (dependent object error) Low

Done, but without explicit RESTRICT keyword since it's default behavior.

- Tests specifying NODE/RELATIONSHIP instead of VERTEX/EDGE (Low priority)

Done

> propgraphcmds.c:136,179,254-262 TEMP table in graph creation Auto TEMP conversion

Done

> propgraphcmds.c:1323,1364 ALTER ADD TEMP to perm graph Error branch

Done

> propgraphcmds.c:724-734 CREATE without LABEL clause Default label else

This should be covered. There are several CREATE PROPERTY GRAPH
statements without LABEL clauses.

> propgraphcmds.c:1300,1303 ALTER IF EXISTS non-existent missing_ok branch

This is unreachable since the syntax is not supported. The only IF
EXISTS variant supported is ALTER PROPERTY GRAPH IF EXISTS ... SET
SCHEMA .... Added a test for the same.

> propgraphcmds.c:116 CREATE UNLOGGED attempt UNLOGGED error

Done

> execMain.c:1162-1163 DML on graph RELKIND_PROPGRAPH

This is unreachable since a property graph reference is only allowed
in GRAPH_TABLE().

> rewriteGraphTable.c:120 Multiple path patterns Length check

Done

> rewriteGraphTable.c:205 Quantifier usage Quantifier error

Done

> ruleutils.c:7939-7963 Complex label VIEW deparsing T_BoolExpr branch
>

Done

Can you please check whether the uncovered lines are now covered?

> - Tests executed: run_pgq_tests.sql

What's this test? It's not in the patchset. Also you might want to try
002_pg_upgrade.

>
> +----------------------------------------------------------+
> | SQL/PGQ Implementation: NO MEMORY LEAKS DETECTED |
> +----------------------------------------------------------+

Great. Thanks for confirming.

>>
>> On Fri, Dec 26, 2025 at 6:03 PM Henson Choi <assam258(at)gmail(dot)com> wrote:
>> >
>> > 1. LABELS() function
>> > - Returns text[] of element labels
>> > - Fixed privilege checking from previous version
>> > - Enables optimizer pushdown for branch pruning
>> >
>> > 2. PROPERTY_NAMES() function
>> > - Returns text[] of property names
>> > - Similar approach to LABELS()
>> >
>>
>> I could not find specification of these functions in SQL/PGQ standard.
>> It's a large document and I might be missing something. Can you please
>> point me to the relevant sections?
>>
>
> You're correct - LABELS() and PROPERTY_NAMES() are not in the SQL/PGQ
> standard. I was inspired by similar functions in Cypher (labels(), keys())
> and Oracle's PGQL (which has LABELS(), though Oracle is moving toward
> SQL/PGQ compliance as well).
>
> The attached code demonstrates that through query rewrite, these functions
> can enable planner-level table pruning optimizations. This becomes
> particularly valuable for client applications - GUI tools could use label
> names as host variables for flexible label-based filtering, and
> PROPERTY_NAMES() (similar to Cypher's keys()) would enable dynamic property
> inspection for display or selective querying of elements with specific
> properties.
>
> While there's a distinction between structured (SQL/PGQ) and semi-structured
> (Cypher) query models, I don't think we need to strictly exclude
> semi-structured patterns that work well within the structured framework.
> Whether this should be proposed as a future SQL/PGQ standard extension or
> remain a PostgreSQL-specific extension is worth discussion. Given the
> practical utility demonstrated in graph query deployments, there might be
> value in standardizing such introspection functions.

While these functions are useful with semistructured frameworks like
Cypher, I am not sure how useful they are in structured framework like
SQL/PGQ. In SQL/PGQ, the graph schema is predefined and known to the
users. Hence users know what labels and properties exist in the graph.
However, if they become part of the standard, their chances of getting
accepted in the PostgreSQL core will increase.

Even if we keep them out of core, I think we should be able to
implement them as stable functions in an extension and still benefit
from planner optimizations you mentioned.

>
> Similarly, I'd suggest considering Cypher's SHORTEST PATH for future
> SQL/PGQ standards. If we treat SHORTEST PATH as a specialized join type
> in SQL and handle it through query rewrite, PostgreSQL's existing planner
> infrastructure could support it efficiently. This approach has been proven
> in practice and could be a valuable addition to the next standard revision.

IIRC there is SQL/PGQ standard specification for shortest path. We
should implement that when we get to it.

>
> That was my finding as well - the architecture is well-designed for
> extensibility.

Great. Thanks for the confirmation.

>
> Having more people who can maintain and extend SQL/PGQ reduces the
> risk for both the community and PostgreSQL itself. When I have a revised
> version ready, would you be willing to review it again?
>

We will need to prioritize the features to be implemented next. This
looks like a useful pattern to support. I believe this will appear
higher in the priority list, thus worth reviewing.

The patches are thus
0001 - same as previous except a. addition of new lines in
create_property_graph.sql for consistency with other sections in that
file, b. some minor corrections suggested by Hensen.
0002 - fixes : references and improves documentation. I think we can
merge this into 0001 after a quick review from Peter
0003 - ECPG test. This might need a bit of review. The patch is large
because of the .c and .err files.
0004 - This reverts back ECPG and PSQL lexer changes introduced in
0001. Does not show any failure even in the test case added by 0003.
Those changes seem unnecessary. Peter, can you please confirm.
0005 - Additional test cases for code coverage as pointed out by
Hensen's report.

--
Best Wishes,
Ashutosh Bapat

Attachment	Content-Type	Size
v20260102-0001-WIP-SQL-Property-Graph-Queries-SQL-PGQ.patch	text/x-patch	663.0 KB
v20260102-0005-More-test-coverage.patch	text/x-patch	15.2 KB
v20260102-0004-Possibly-unnecessary-changes.patch	text/x-patch	1.5 KB
v20260102-0003-Test-SQL-PGQ-with-ECPG.patch	text/x-patch	27.2 KB
v20260102-0002-Fix-some-more-references-and-edit-document.patch	text/x-patch	6.2 KB

In response to

Re: SQL Property Graph Queries (SQL/PGQ) at 2026-01-01 12:16:19 from Henson Choi

Responses

Re: SQL Property Graph Queries (SQL/PGQ) at 2026-01-02 13:23:39 from Henson Choi
Re: SQL Property Graph Queries (SQL/PGQ) at 2026-01-07 06:45:53 from Henson Choi
Re: SQL Property Graph Queries (SQL/PGQ) at 2026-01-09 04:52:30 from Junwang Zhao
Re: SQL Property Graph Queries (SQL/PGQ) at 2026-01-13 10:17:30 from Peter Eisentraut
Re: SQL Property Graph Queries (SQL/PGQ) at 2026-01-13 10:23:25 from Peter Eisentraut

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Xuneng Zhou	2026-01-02 09:17:34	Re: Implement waiting for wal lsn replay: reloaded
Previous Message	jian he	2026-01-02 08:05:17	Re: Non-text mode for pg_dumpall