| From: | Henson Choi <assam258(at)gmail(dot)com> |
|---|---|
| To: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
| Cc: | Peter Eisentraut <peter(at)eisentraut(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Vik Fearing <vik(at)postgresfriends(dot)org>, Ajay Pal <ajay(dot)pal(dot)k(at)gmail(dot)com>, Imran Zaheer <imran(dot)zhir(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: SQL Property Graph Queries (SQL/PGQ) |
| Date: | 2026-01-07 14:16:11 |
| Message-ID: | CAAAe_zDs+Nq75Qt7K5VqUWytvRO_+9brJPU9WXSiCvEnDQ=s_w@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi Ashutosh,
I analyzed the pg_upgrade coverage for property graph functions in
ruleutils.c. Here are the findings:
What's this test? It's not in the patchset. Also you might want to try
> 002_pg_upgrade.
1. UNCOVERED CODE ANALYSIS
--------------------------
The uncovered lines in ruleutils.c property graph functions fall into
four categories:
1.1 Defensive Code (5 lines)
Error handling that cannot be reached in normal operation.
Line Code Reason
-------------------------------------------------------------------------
1628 PG_RETURN_NULL() Invalid OID check
1710 elog(ERROR, "null pgekey") Parser prevents
1733 elog(ERROR, "cache lookup failed") System catalog guard
1750 elog(ERROR, "cache lookup failed") System catalog guard
7962-7963 elog(ERROR, "unrecognized node type") Default switch case
1.2 Dead Code (2 lines)
Unreachable due to design - implicit SOURCE/DEST is expanded during
DDL processing. FK lookup always populates pgesrckey/pgedestkey in
the catalog, so pg_get_propgraphdef() never sees NULL keys.
Line Code Reason
-------------------------------------------------------------------------
1744 appendStringInfo(... pgealias) Implicit SOURCE path
1761 appendStringInfo(... pgealias) Implicit DEST path
1.3 Unimplemented Features (~13 lines)
Parser accepts but rewriter rejects these patterns. Cannot be stored
as VIEW/RULE, so deparse code is unreachable.
This code will be used when these features are implemented.
Feature Lines Rewriter Rejection
-------------------------------------------------------------------------
Quantifier {m,n} 8046-8049 rewriteGraphTable.c:204-207
PAREN_EXPR 7994-7996, 8039-8041 rewriteGraphTable.c:201-202
Multiple Paths 8069 rewriteGraphTable.c:119-120
Subexpr 8013-8017 Part of PAREN_EXPR
1.4 Testable Code - Needs Coverage (4 features)
Supported features but not covered by current pg_upgrade tests.
Feature Lines Test Method
-------------------------------------------------------------------------
Left-pointing Edge 7987-7989, 8032-8035 VIEW + pg_get_viewdef()
Element WHERE 8020-8024 VIEW + pg_get_viewdef()
Graph Pattern WHERE 8078-8079 VIEW + pg_get_viewdef()
AS Alias (DDL) 1698-1700 pg_get_propgraphdef()
2. PROPOSED TEST CASES
----------------------
To cover the testable code paths:
-- Left-pointing edge (covers 7987-7989, 8032-8035)
CREATE VIEW v AS SELECT * FROM GRAPH_TABLE (g
MATCH (a)<-[e]-(b) COLUMNS (a.x, b.y));
-- Element WHERE on vertex (covers 8020-8024)
CREATE VIEW v AS SELECT * FROM GRAPH_TABLE (g
MATCH (a WHERE a.age > 25)-[e]->(b) COLUMNS (a.name));
-- Element WHERE on edge (covers 8020-8024)
CREATE VIEW v AS SELECT * FROM GRAPH_TABLE (g
MATCH (a)-[e WHERE e.since > 2020]->(b) COLUMNS (a.name));
-- Graph Pattern WHERE (covers 8078-8079)
CREATE VIEW v AS SELECT * FROM GRAPH_TABLE (g
MATCH (a)-[e]->(b) WHERE a.x > 10 COLUMNS (a.name));
-- AS Alias in DDL (covers 1698-1700)
CREATE PROPERTY GRAPH g VERTEX TABLES (t AS alias KEY (id));
3. VERIFICATION
---------------
I verified these patterns are testable by creating VIEWs and checking
pg_get_viewdef() output locally. All proposed test cases execute the
target code paths successfully.
4. SUMMARY
----------
Category Lines Status
---------------------------------------------------------
Defensive Code 5 Unreachable - error handling
Dead Code 2 Unreachable - design limitation
Unimplemented ~13 Unreachable - rewriter blocks
Testable 4 Recommend adding tests
---
End of Report
Best regards,
Henson
2026년 1월 7일 (수) PM 3:45, Henson Choi <assam258(at)gmail(dot)com>님이 작성:
>
> Hi Ashutosh,
>
> > Can you please check whether the uncovered lines are now covered?
>
> SQL/PGQ Coverage Verification Report
> =====================================
>
> I've re-run the coverage analysis with the latest patches (0001-0005).
> Here are the results:
>
>
> 1. COVERAGE COMPARISON
> ----------------------
>
> Previous Current Change
> ---------------------------------------------------------
> Overall Coverage 90.5% 91.4% +0.9%
> Covered Lines 2,029 2,049 +20
> Untested Lines 214 194 -20
>
>
> 2. KEY FILE IMPROVEMENTS
> ------------------------
>
> File Previous Current Change
> ---------------------------------------------------------
> propgraphcmds.c 94.6% 95.3% +0.7%
> rewriteGraphTable.c 94.6% 95.0% +0.4%
> ruleutils.c 85.1% 89.2% +4.1%
>
>
> 3. WHITE-BOX TEST VERIFICATION (Section 5.2)
> --------------------------------------------
>
> File:Line Test Case Status
>
> -------------------------------------------------------------------------------
> propgraphcmds.c:116 CREATE UNLOGGED attempt COVERED
>
> propgraphcmds.c:136 TEMP table in graph creation COVERED
> propgraphcmds.c:179 (same group)
> UNCOVERED [1]
> propgraphcmds.c:254-262 (same group) COVERED
>
> propgraphcmds.c:1323 ALTER ADD TEMP to perm graph COVERED
> propgraphcmds.c:1364 (same group)
> UNCOVERED [1]
>
> propgraphcmds.c:726-733 CREATE without LABEL clause
> UNREACHABLE [2]
>
> propgraphcmds.c:1300,1303 ALTER IF EXISTS non-existent
> UNREACHABLE [3]
>
> execMain.c:1162-1163 DML on graph
> UNREACHABLE [4]
>
> rewriteGraphTable.c:120 Multiple path patterns COVERED
>
> rewriteGraphTable.c:205 Quantifier usage COVERED
>
> ruleutils.c:7939-7961 Complex label VIEW deparsing COVERED
> ruleutils.c:7962-7963 (same group)
> DEFENSIVE [5]
>
> Summary: 7 covered, 3 unreachable, 2 uncovered, 1 defensive
>
> Notes:
> [1] TEMP edge table tests missing. Line 136/1323 cover vertex tables,
> but 179/1364 are the corresponding edge table branches.
> [2] Parser always creates default label entry (gram.y:9591-9601).
> The else-block at 726-733 is dead code; einfo->labels is never NIL.
> [3] Syntax not supported; only SET SCHEMA has IF EXISTS variant.
> [4] Graph reference only allowed in GRAPH_TABLE(), parser blocks DML.
> [5] Default case for unrecognized node type (elog ERROR).
>
>
> 4. UNTESTABLE CODE VERIFICATION (Section 5.3)
> ---------------------------------------------
>
> File:Line Reason Status
>
> -------------------------------------------------------------------------------
> rewriteGraphTable.c:202 PAREN_EXPR blocked at parser AGREED
> rewriteGraphTable.c:297,304,319 Cyclic pattern edge conflict AGREED
> rewriteGraphTable.c:801-818 get_gep_kind_name error only AGREED
> nodeFuncs.c:4585-4596,4755-4774 Walker branches internal AGREED
>
> Summary: 4 items confirmed unreachable as expected
>
>
> 5. REMAINING UNCOVERED LINES
> ----------------------------
>
> 5.1 Unreachable Code (by design):
> - propgraphcmds.c:726-733 - Parser creates default label (gram.y:9591)
> - propgraphcmds.c:1300,1303 - IF EXISTS syntax not supported
> - execMain.c:1162-1163 - Graph reference only in GRAPH_TABLE()
>
> 5.2 Defensive/Error-only Code:
> - get_gep_kind_name() - Error message function only
> - nodeFuncs.c walker - Internal implementation details
> - ruleutils.c:7962-7963 - Default case (elog ERROR)
>
> 5.3 Test Gaps (remaining):
> - propgraphcmds.c:179 - TEMP edge table in CREATE
> - propgraphcmds.c:1364 - TEMP edge table in ALTER ADD
>
>
> 6. RECOMMENDATIONS
> ------------------
>
> 6.1 Proposed Test Cases (Optional)
>
> To cover the 2 remaining uncovered lines:
>
> -- TEMP edge table in CREATE (covers line 179)
> CREATE TEMP TABLE te_src (id int PRIMARY KEY);
> CREATE TEMP TABLE te_dst (id int PRIMARY KEY);
> CREATE TEMP TABLE te (id int PRIMARY KEY, src int, dst int);
> CREATE PROPERTY GRAPH pg_temp_edge
> VERTEX TABLES (te_src, te_dst)
> EDGE TABLES (te KEY (id)
> SOURCE KEY (src) REFERENCES te_src (id)
> DESTINATION KEY (dst) REFERENCES te_dst (id));
>
> -- TEMP edge table in ALTER ADD (covers line 1364)
> CREATE TABLE pv1 (id int PRIMARY KEY);
> CREATE TABLE pv2 (id int PRIMARY KEY);
> CREATE TEMP TABLE pe (id int PRIMARY KEY, src int, dst int);
> CREATE PROPERTY GRAPH pg_perm VERTEX TABLES (pv1, pv2);
> ALTER PROPERTY GRAPH pg_perm ADD EDGE TABLES (pe KEY (id)
> SOURCE KEY (src) REFERENCES pv1 (id)
> DESTINATION KEY (dst) REFERENCES pv2 (id));
> -- Expected: ERROR (cannot add temp edge to permanent graph)
>
> 6.2 Dead Code Cleanup (Optional)
>
> propgraphcmds.c:726-733 is unreachable dead code. Consider replacing
> the else-block with defensive error handling:
>
> if (einfo->labels)
> {
> ...
> }
> else
> {
> /* Parser guarantees labels list is never empty */
> elog(ERROR, "unexpected empty labels list");
> }
>
> 6.3 Minor Issue
>
> Patch 0003 (ECPG test) is missing .gitignore entries.
> src/interfaces/ecpg/test/sql/.gitignore should include:
>
> /sqlpgq
> /sqlpgq.c
>
>
> 7. CONCLUSION
> -------------
>
> The coverage has improved from 90.5% to 91.4%. The additional tests in
> patch 0005 have addressed most coverage concerns raised in my initial
> report:
>
> - Section 5.2: 7 of 13 items now covered
> - Section 5.3: 4 items confirmed unreachable as expected
> - Remaining: 3 unreachable, 1 defensive, 2 uncovered
>
> Note on 726-733: Analysis of gram.y (lines 9591-9601) confirms the parser
> always creates a default label entry when no LABEL clause is specified.
> The else-block is dead code. Recommend replacing with elog(ERROR).
>
> The 2 remaining uncovered lines (179, 1364) are TEMP edge table branches.
> Adding edge-specific tests would complete the coverage.
>
>
> ---
> End of Report
>
> Best regards,
> Henson
>
> 2026년 1월 2일 (금) PM 5:48, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>님이
> 작성:
>
>> Hi Henson,
>>
>> Thanks for your systematic review.
>>
>> On Wed, Dec 31, 2025 at 11:53 AM Henson Choi <assam258(at)gmail(dot)com> wrote:
>> >
>> > Overall assessment: GOOD
>>
>> Glad to know that.
>>
>> > - Test Coverage: Good (90.5% line coverage, ~180 test cases)
>>
>> Thanks for the coverage analysis. Looks good right now. But see below.
>>
>> > 2.1 Critical / Major
>> > (None)
>>
>> Great.
>>
>> >
>> > 2.2 Minor Issues
>> >
>> > #1 [Code] src/backend/commands/propgraphcmds.c:1632
>> > FIXME: Missing index for plppropid column causes sequential scan.
>> > Decision needed: (a) add index, or (c) allow seq scan for rare path.
>>
>> The path is rare enough that I think we can allow the seq scan. Given
>> that Peter has marked it as FIXME, it seems we will keep it as is for
>> now, but will revisit if performance becomes an issue. Peter, please
>> correct me if I'm wrong.
>>
>> >
>> > #2 [Doc] doc/src/sgml/ref/alter_property_graph.sgml:286
>> > Compatibility section incorrectly states "CREATE PROPERTY GRAPH"
>> > Should be: "ALTER PROPERTY GRAPH"
>> >
>>
>> That's right. Fixed.
>>
>> > #3 [Doc] doc/src/sgml/ref/alter_property_graph.sgml:65
>> > Synopsis shows VERTEX/NODE and EDGE/RELATIONSHIP as separate clauses,
>> > but Description merges them as {VERTEX|NODE|EDGE|RELATIONSHIP}.
>>
>> Their syntax is slightly different hence they are separate in
>> Synopsis. If we separate them in the Description, we might have to
>> repeat the same text twice. Having them merged in the Description
>> makes it more concise without losing any clarity or meaning. I think
>> we can keep it as is. What do you think?
>>
>> >
>> > #4 [Doc] doc/src/sgml/ref/alter_property_graph.sgml:80
>> > Grammar error: "the graph removed" should be "the graph is removed"
>>
>> Right. Fixed.
>>
>> >
>> > #5 [Doc] doc/src/sgml/queries.sgml:2929,2931
>> > TODO comments for unimplemented features without clear limitation
>> notes.
>>
>> Those TODOs are not visible to users. I think they serve as
>> placeholders to add the documentation when the features are
>> implemented. We may want to remove them, but I see no harm in keeping
>> them for now. Peter, what do you think?
>>
>> >
>> > #6 [Doc] doc/src/sgml/ddl.sgml:5717
>> > Typo: "infrastucture" should be "infrastructure"
>> >
>>
>> Fixed. Thanks for catching it.
>>
>> > 2.3 Alternative Approaches for Discussion
>> >
>> > #1 Support CREATE PROPERTY GRAPH IF NOT EXISTS
>> > Rationale: PostgreSQL-style extension, consistent with other DDL.
>> >
>>
>> I think this will be good-to-have for consistency across DDLs.
>> However, I think it is better to discuss and implement it separately
>> since lack of it does not block the main functionality of property
>> graphs. Meantime users can DROP and CREATE. There are other DDLs which
>> do not have this support. What do you think?
>>
>> > #2 Return 0 rows for non-existent labels instead of error
>> > Current: SELECT * FROM GRAPH_TABLE(g MATCH (a:NonExistent) ...)
>> raises error
>> > Alternative: Return empty result set instead
>> > Rationale: Better for exploratory queries, similar to SQL WHERE on
>> > non-existent values returning 0 rows rather than error.
>>
>> I think this makes sense in a proper graph database where the labels
>> are not predefined. However, in property graphs the labels, the
>> properties associated with them and data types of those properties are
>> predefined in the graph schema. If the specified label does not exist
>> in the graph schema, we can not determine what properties are
>> associated with it and their data types. In turn we can not determine
>> the shape of the result set, which is essential for reporting even 0
>> rows. Hence the error instead of 0 rows. Oracle has similar behaviour.
>>
>> >
>> > #3 Return 0 rows when same variable has different labels
>> > Current: SELECT * FROM GRAPH_TABLE(g MATCH
>> (a:Person)-[e]-(a:Company) ...)
>> > raises error because variable 'a' has conflicting labels.
>> > Alternative: Return empty result set instead
>> > Rationale: Logically, a node cannot be both Person AND Company,
>> > so 0 rows is the correct result. Consistent with standard SQL
>> > WHERE semantics (impossible condition = 0 rows, not error).
>> >
>>
>> In such a case 0 rows isn't a correct result. According to the
>> standard, variable a is bound to all the nodes that have both the
>> labels. This is label conjunction, a feature we don't support right
>> now. Hence the "unsupported feature" error. I expect that some day we
>> will implement label conjunction, and then the same query will return
>> non-zero rows if there are nodes with both labels.
>>
>>
>> >
>> -------------------------------------------------------------------------------
>> > TEMP graph CREATE TEMP PROPERTY GRAPH Medium
>>
>> Done
>>
>> > TEMP graph TEMP graph with permanent table reference Medium
>>
>> Done
>>
>> > TEMP graph ALTER ADD permanent table to TEMP graph Medium
>>
>> Done
>>
>> > CASCADE DROP ... CASCADE (dependent view cascade) Low
>>
>> Done
>>
>> > RESTRICT DROP ... RESTRICT (dependent object error) Low
>>
>> Done, but without explicit RESTRICT keyword since it's default behavior.
>>
>> - Tests specifying NODE/RELATIONSHIP instead of VERTEX/EDGE (Low priority)
>>
>> Done
>>
>> > propgraphcmds.c:136,179,254-262 TEMP table in graph creation Auto
>> TEMP conversion
>>
>> Done
>>
>> > propgraphcmds.c:1323,1364 ALTER ADD TEMP to perm graph Error
>> branch
>>
>> Done
>>
>> > propgraphcmds.c:724-734 CREATE without LABEL clause
>> Default label else
>>
>> This should be covered. There are several CREATE PROPERTY GRAPH
>> statements without LABEL clauses.
>>
>> > propgraphcmds.c:1300,1303 ALTER IF EXISTS non-existent
>> missing_ok branch
>>
>> This is unreachable since the syntax is not supported. The only IF
>> EXISTS variant supported is ALTER PROPERTY GRAPH IF EXISTS ... SET
>> SCHEMA .... Added a test for the same.
>>
>> > propgraphcmds.c:116 CREATE UNLOGGED attempt
>> UNLOGGED error
>>
>> Done
>>
>> > execMain.c:1162-1163 DML on graph
>> RELKIND_PROPGRAPH
>>
>> This is unreachable since a property graph reference is only allowed
>> in GRAPH_TABLE().
>>
>> > rewriteGraphTable.c:120 Multiple path patterns
>> Length check
>>
>> Done
>>
>> > rewriteGraphTable.c:205 Quantifier usage
>> Quantifier error
>>
>> Done
>>
>> > ruleutils.c:7939-7963 Complex label VIEW deparsing
>> T_BoolExpr branch
>> >
>>
>> Done
>>
>> Can you please check whether the uncovered lines are now covered?
>>
>> > - Tests executed: run_pgq_tests.sql
>>
>> What's this test? It's not in the patchset. Also you might want to try
>> 002_pg_upgrade.
>>
>> >
>> > +----------------------------------------------------------+
>> > | SQL/PGQ Implementation: NO MEMORY LEAKS DETECTED |
>> > +----------------------------------------------------------+
>>
>> Great. Thanks for confirming.
>>
>> >>
>> >> On Fri, Dec 26, 2025 at 6:03 PM Henson Choi <assam258(at)gmail(dot)com>
>> wrote:
>> >> >
>> >> > 1. LABELS() function
>> >> > - Returns text[] of element labels
>> >> > - Fixed privilege checking from previous version
>> >> > - Enables optimizer pushdown for branch pruning
>> >> >
>> >> > 2. PROPERTY_NAMES() function
>> >> > - Returns text[] of property names
>> >> > - Similar approach to LABELS()
>> >> >
>> >>
>> >> I could not find specification of these functions in SQL/PGQ standard.
>> >> It's a large document and I might be missing something. Can you please
>> >> point me to the relevant sections?
>> >>
>> >
>> > You're correct - LABELS() and PROPERTY_NAMES() are not in the SQL/PGQ
>> > standard. I was inspired by similar functions in Cypher (labels(),
>> keys())
>> > and Oracle's PGQL (which has LABELS(), though Oracle is moving toward
>> > SQL/PGQ compliance as well).
>> >
>> > The attached code demonstrates that through query rewrite, these
>> functions
>> > can enable planner-level table pruning optimizations. This becomes
>> > particularly valuable for client applications - GUI tools could use
>> label
>> > names as host variables for flexible label-based filtering, and
>> > PROPERTY_NAMES() (similar to Cypher's keys()) would enable dynamic
>> property
>> > inspection for display or selective querying of elements with specific
>> > properties.
>> >
>> > While there's a distinction between structured (SQL/PGQ) and
>> semi-structured
>> > (Cypher) query models, I don't think we need to strictly exclude
>> > semi-structured patterns that work well within the structured framework.
>> > Whether this should be proposed as a future SQL/PGQ standard extension
>> or
>> > remain a PostgreSQL-specific extension is worth discussion. Given the
>> > practical utility demonstrated in graph query deployments, there might
>> be
>> > value in standardizing such introspection functions.
>>
>> While these functions are useful with semistructured frameworks like
>> Cypher, I am not sure how useful they are in structured framework like
>> SQL/PGQ. In SQL/PGQ, the graph schema is predefined and known to the
>> users. Hence users know what labels and properties exist in the graph.
>> However, if they become part of the standard, their chances of getting
>> accepted in the PostgreSQL core will increase.
>>
>> Even if we keep them out of core, I think we should be able to
>> implement them as stable functions in an extension and still benefit
>> from planner optimizations you mentioned.
>>
>> >
>> > Similarly, I'd suggest considering Cypher's SHORTEST PATH for future
>> > SQL/PGQ standards. If we treat SHORTEST PATH as a specialized join type
>> > in SQL and handle it through query rewrite, PostgreSQL's existing
>> planner
>> > infrastructure could support it efficiently. This approach has been
>> proven
>> > in practice and could be a valuable addition to the next standard
>> revision.
>>
>> IIRC there is SQL/PGQ standard specification for shortest path. We
>> should implement that when we get to it.
>>
>> >
>> > That was my finding as well - the architecture is well-designed for
>> > extensibility.
>>
>> Great. Thanks for the confirmation.
>>
>> >
>> > Having more people who can maintain and extend SQL/PGQ reduces the
>> > risk for both the community and PostgreSQL itself. When I have a revised
>> > version ready, would you be willing to review it again?
>> >
>>
>> We will need to prioritize the features to be implemented next. This
>> looks like a useful pattern to support. I believe this will appear
>> higher in the priority list, thus worth reviewing.
>>
>> The patches are thus
>> 0001 - same as previous except a. addition of new lines in
>> create_property_graph.sql for consistency with other sections in that
>> file, b. some minor corrections suggested by Hensen.
>> 0002 - fixes : references and improves documentation. I think we can
>> merge this into 0001 after a quick review from Peter
>> 0003 - ECPG test. This might need a bit of review. The patch is large
>> because of the .c and .err files.
>> 0004 - This reverts back ECPG and PSQL lexer changes introduced in
>> 0001. Does not show any failure even in the test case added by 0003.
>> Those changes seem unnecessary. Peter, can you please confirm.
>> 0005 - Additional test cases for code coverage as pointed out by
>> Hensen's report.
>>
>> --
>> Best Wishes,
>> Ashutosh Bapat
>>
>
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Aleksander Alekseev | 2026-01-07 14:42:27 | Re: [PATCH] Refactor SLRU to always use long file names |
| Previous Message | Greg Sabino Mullane | 2026-01-07 14:06:54 | Re: [PATCH] Provide support for trailing commas |