Re: SQL Property Graph Queries (SQL/PGQ)

From: Henson Choi <assam258(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Vik Fearing <vik(at)postgresfriends(dot)org>, Ajay Pal <ajay(dot)pal(dot)k(at)gmail(dot)com>, Imran Zaheer <imran(dot)zhir(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SQL Property Graph Queries (SQL/PGQ)
Date: 2025-12-31 06:23:13
Message-ID: CAAAe_zCPm3W7ZTSBfKUgVJPrTn4-D99agGsz5P2MBLZN=tME3g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

SQL/PGQ Patch Review Report
============================

Patch: SQL Property Graph Queries (SQL/PGQ)
Commitfest: https://commitfest.postgresql.org/patch/4904

Review Methodology:
This review focused on quality assessment, not line-by-line code audit.
Key code paths and quality issues were examined with surrounding context
when concerns arose. Documentation files were reviewed with AI-assisted
grammar and typo checking. Code coverage was measured using gcov and
custom analysis tools.

Limitations:
- Not a security audit or formal verification
- Parser and executor internals reviewed at module level, not exhaustively
- Performance characteristics not benchmarked

TABLE OF CONTENTS
-----------------

1. Executive Summary
2. Issues Found
2.1 Critical / Major
2.2 Minor Issues
2.3 Suggestions for Discussion
3. Test Coverage
3.1 Covered Areas
3.2 Untested Items
3.3 Unimplemented Features (No Test Needed)
4. Code Coverage Analysis
4.1 Overall Coverage
4.2 Coverage by File
4.3 Uncovered Code Risk Assessment
5. Recommended Additional Tests
5.1 Black-box Tests (Functional)
5.2 White-box Tests (Coverage)
5.3 Untestable Code (Defensive)
6. Recommendations
7. Conclusion

1. EXECUTIVE SUMMARY
--------------------

Overall assessment: GOOD

The SQL/PGQ patch demonstrates solid implementation quality within its
defined scope. Code follows PostgreSQL coding standards, test coverage
is comprehensive at 90.5%, and documentation is thorough with only
minor typos.

Rating by Area:
- Code Quality: Excellent (PostgreSQL style compliant, 1 FIXME)
- Test Coverage: Good (90.5% line coverage, ~180 test cases)
- Documentation: Good (Complete, 5 minor issues)
- Build/Regress: Pass (make check-world, 56 test suites passed)

2. ISSUES FOUND
---------------

2.1 Critical / Major
(None)

2.2 Minor Issues

#1 [Code] src/backend/commands/propgraphcmds.c:1632
FIXME: Missing index for plppropid column causes sequential scan.
Decision needed: (a) add index, or (c) allow seq scan for rare path.

#2 [Doc] doc/src/sgml/ref/alter_property_graph.sgml:286
Compatibility section incorrectly states "CREATE PROPERTY GRAPH"
Should be: "ALTER PROPERTY GRAPH"

#3 [Doc] doc/src/sgml/ref/alter_property_graph.sgml:65
Synopsis shows VERTEX/NODE and EDGE/RELATIONSHIP as separate clauses,
but Description merges them as {VERTEX|NODE|EDGE|RELATIONSHIP}.

#4 [Doc] doc/src/sgml/ref/alter_property_graph.sgml:80
Grammar error: "the graph removed" should be "the graph is removed"

#5 [Doc] doc/src/sgml/queries.sgml:2929,2931
TODO comments for unimplemented features without clear limitation notes.

#6 [Doc] doc/src/sgml/ddl.sgml:5717
Typo: "infrastucture" should be "infrastructure"

2.3 Alternative Approaches for Discussion

#1 Support CREATE PROPERTY GRAPH IF NOT EXISTS
Rationale: PostgreSQL-style extension, consistent with other DDL.

#2 Return 0 rows for non-existent labels instead of error
Current: SELECT * FROM GRAPH_TABLE(g MATCH (a:NonExistent) ...) raises
error
Alternative: Return empty result set instead
Rationale: Better for exploratory queries, similar to SQL WHERE on
non-existent values returning 0 rows rather than error.

#3 Return 0 rows when same variable has different labels
Current: SELECT * FROM GRAPH_TABLE(g MATCH (a:Person)-[e]-(a:Company)
...)
raises error because variable 'a' has conflicting labels.
Alternative: Return empty result set instead
Rationale: Logically, a node cannot be both Person AND Company,
so 0 rows is the correct result. Consistent with standard SQL
WHERE semantics (impossible condition = 0 rows, not error).

3. TEST COVERAGE
----------------

3.1 Covered Areas

- CREATE PROPERTY GRAPH: Empty graph, VERTEX/EDGE TABLES, KEY, LABEL,
PROPERTIES
- ALTER PROPERTY GRAPH: ADD/DROP VERTEX/EDGE, ADD/DROP LABEL, ADD/DROP
PROPERTIES
- DROP PROPERTY GRAPH: Basic DROP, IF EXISTS
- RENAME TO / SET SCHEMA: Tested in alter_generic.sql
- OWNER TO: Permission change tests
- GRAPH_TABLE queries: Pattern matching, WHERE, COLUMNS, LATERAL, etc.
- RLS integration: PERMISSIVE/RESTRICTIVE policies, inheritance/partition
tables
- Error cases: Type/collation mismatch, non-existent objects, etc.
- psql integration: \dG, \d commands
- pg_dump: Tested in t/002_pg_dump.pl

3.2 Untested Items

- CREATE TEMP PROPERTY GRAPH (Medium priority)
- DROP ... CASCADE (Low priority)
- DROP ... RESTRICT (Low priority)

3.3 Unimplemented Features (No Test Needed)

- Multiple path patterns: Parser supports, execution not implemented
- Quantifier {n,m}: Parser supports, execution not implemented

4. CODE COVERAGE ANALYSIS
-------------------------

4.1 Overall Coverage

90.5% (2,029 / 2,243 lines)

4.2 Coverage by File

propgraphcmds.c: 94.6% (DDL command processing)
rewriteGraphTable.c: 94.6% (GRAPH_TABLE rewrite)
parse_graphtable.c: 97.6% (Graph query parser)
parse_clause.c: 98.0% (FROM clause parser)
ruleutils.c: 85.1% (pg_get_propgraphdef, etc.)
pg_dump.c: 84.0% (Dump/restore)
describe.c: 88.9% (psql \dG command)

4.3 Uncovered Code Risk Assessment

Low Risk (13 items):
- Defensive code, debug functions, unimplemented feature guards

Medium Risk (1 item):
- TEMP graph functionality untested

Conclusion: Most uncovered code consists of error handling, unimplemented
feature guards, and debug utilities. No security or functional risk.

5. RECOMMENDED ADDITIONAL TESTS
-------------------------------

5.1 Black-box Tests (Functional)

Based on feature specification, independent of code structure.

Feature Test Case Priority
-------------------------------------------------------------------------------
TEMP graph CREATE TEMP PROPERTY GRAPH Medium
TEMP graph TEMP graph with permanent table reference Medium
TEMP graph ALTER ADD permanent table to TEMP graph Medium
CASCADE DROP ... CASCADE (dependent view cascade) Low
RESTRICT DROP ... RESTRICT (dependent object error) Low

5.2 White-box Tests (Coverage)

Based on uncovered code paths identified in coverage analysis.

File:Line Test Case Target
Branch
-------------------------------------------------------------------------------
propgraphcmds.c:136,179,254-262 TEMP table in graph creation Auto TEMP
conversion
propgraphcmds.c:1323,1364 ALTER ADD TEMP to perm graph Error
branch
propgraphcmds.c:724-734 CREATE without LABEL clause Default
label else
propgraphcmds.c:1300,1303 ALTER IF EXISTS non-existent
missing_ok branch
propgraphcmds.c:116 CREATE UNLOGGED attempt UNLOGGED
error
execMain.c:1162-1163 DML on graph
RELKIND_PROPGRAPH
rewriteGraphTable.c:120 Multiple path patterns Length
check
rewriteGraphTable.c:205 Quantifier usage
Quantifier error
ruleutils.c:7939-7963 Complex label VIEW deparsing
T_BoolExpr branch

5.3 Untestable Code (Defensive)

File:Line Reason
-------------------------------------------------------------------------------
rewriteGraphTable.c:202 PAREN_EXPR blocked at parser level
rewriteGraphTable.c:297,304,319 Cyclic pattern edge conflict -
unreachable
rewriteGraphTable.c:801-818 get_gep_kind_name - error path only
nodeFuncs.c:4585-4596,4755-4774 Walker branches - internal implementation

6. RECOMMENDATIONS
------------------

6.1 Code Review Required (Minor, Decision Needed)

Location: propgraphcmds.c:1632
Issue: FIXME - No single-column index for plppropid
Current: InvalidOid causes sequential scan
Purpose: Check property references across all labels when deleting orphaned
properties
Options:
(a) Add index: Create new single-column index on plppropid
(b) Use existing: NOT POSSIBLE (cannot specify plpellabelid, need all
labels)
(c) Allow: Rare path (property deletion), sequential scan acceptable

6.2 Documentation Fixes (Minor)

- alter_property_graph.sgml: 3 typos/errors to fix
- queries.sgml: Add clear limitation notes for unimplemented features
- ddl.sgml: Fix "infrastucture" typo

6.3 Test Additions (Optional)

- CREATE TEMP PROPERTY GRAPH tests
- DROP ... CASCADE/RESTRICT tests

7. CONCLUSION
-------------

Test Quality: GOOD

Core functionality is thoroughly tested with comprehensive error case and
security (RLS) integration tests.

The patch is well-implemented within its defined scope. Identified issues
are minor documentation typos and one code decision point (FIXME).
No critical or major issues found.

Recommended actions before commit:
1. Decide on FIXME at propgraphcmds.c:1632 (index vs. allow seq scan)
2. Fix 5 documentation issues
3. Consider adding TEMP graph tests (optional but recommended)

Points for discussion (optional):
4. CREATE PROPERTY GRAPH IF NOT EXISTS support
5. Error vs. 0 rows behavior for non-existent/conflicting labels

Attachment:
- coverage_report.tar.gz (HTML coverage report generated by gcov)

---
End of Report

2025년 12월 30일 (화) PM 8:27, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>님이
작성:

> On Tue, Dec 30, 2025 at 3:14 PM Ashutosh Bapat
> <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> >
> > On Thu, Dec 18, 2025 at 2:45 PM Ashutosh Bapat
> > <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> wrote:
> > >
> > >
> > > >
> > > > I did another investigation about whether this level of checking is
> > > > necessary. I think according to the letter of the SQL standard, the
> > > > typmods must indeed match. It seems Oracle does not check (the
> example
> > > > mentioned above came from an Oracle source). But I think it's okay
> to
> > > > keep the check. In PostgreSQL, it is much less common to write like
> > > > varchar(1000). And we can always consider relaxing it later.
> > >
> > > +1.
> > >
> > > Attached patch adds a couple more test statements.
> > >
> >
> > Squashed this into the main patchset.
> >
> > > >
> > > > 2) I had it in my notes to consider whether we should support the
> colon
> > > > syntax for label expressions. I think we might have talked about
> that
> > > > before.
> > > >
> > > > I'm now leaning toward not supporting it in the first iteration. I
> > > > don't know that we have fully explored possible conflicts with host
> > > > variable syntax in ecpg and psql and the like. Maybe avoid that for
> now.
> > > >
> > >
> > > I was aware of ecpg and I vaguely remember we fixed something in ECPG
> > > to allow : in MATCH statement. Probably following changes in
> > > psqlscan.l and pgc.l
> > > -self [,()\[\].;\:\+\-\*\/\%\^\<\>\=]
> > > +self [,()\[\].;\:\|\+\-\*\/\%\^\<\>\=]
> > >
> > > Those changes add | after : (and not the : itself) so maybe they are
> > > not about supporting : . Do you remember what those are?
> >
> > I reverted those changes from both the files and ran "meson test". I
> > did not observe any failure. It seems those changes are not needed.
> > But adding them as a separate commit (0004) in case CI bot reveals any
> > failures without them.
> >
> > I noticed that there were no ECPG tests for SQL/PGQ. Added a basic
> > test in patch 0003.
> >
> > >
> > > I spotted some examples that use : in ddl.sgml.
> > > <programlisting>
> > > SELECT customer_name FROM GRAPH_TABLE (myshop MATCH
> > > (c:customer)-[:has]->(o:"order" WHERE o.ordered_when = current_date)
> > > COLUMNS (c.name AS customer_name));
> > > </programlisting>
> > >
> > > The query demonstrates that one can use label names in a way that will
> > > make the pattern look like an English sentence. Replacing : with IS
> > > defeats that purpose.
> > >
> > > As written in that paragraph, the labels serve the purpose of exposing
> > > the table with a different logical view (using different label and
> > > property names). So we need that paragraph, but I think we should
> > > change the example to use IS instead of :. Attached is suggested
> > > minimal change, but I am not happy with it. Another possibility is we
> > > completely remove that paragraph; I don't think we need to discuss all
> > > possible usages the users will come up with.
> > >
> > > The patch changes one more instance of : by IS. But that's straight
> forward.
> > >
> > > In ddl.sgml I noticed a seemingly incomplete sentence
> > > A property graph is a way to represent database contents, instead
> of using
> > > relational structures such as tables.
> > >
> > > Represent the contents as what? I feel the complete sentence should be
> > > one of the following
> > > property graph is a way to represent database contents as a graph,
> > > instead of representing those as relational structures OR
> > > property graph is another way to represent database contents instead
> > > of using relational structures such as tables
> > >
> > > But I can't figure out what was originally intended.
> >
> >
> > 0002 contains some edits to this part of documentation. I think the
> > paragraph reads better than before. Let me know what you think.
> >
> > Please let me know which of 0002 to 0004 look good to you. I will
> > squash those into the patchset in the next version.
>
> The previous patchset had a whitespace difference in ECPG expected
> output files. Fixed in the attached patchset.
>
> --
> Best Wishes,
> Ashutosh Bapat
>

Attachment Content-Type Size
coverage.tgz application/x-compressed 565.4 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Yugo Nagata 2025-12-31 06:57:24 Re: Fix comments on _bt_skiparray_strat_increment/decrement
Previous Message jian he 2025-12-31 05:08:24 Re: CREATE TABLE LIKE INCLUDING TRIGGERS