| From: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
|---|---|
| To: | Peter Eisentraut <peter(at)eisentraut(dot)org> |
| Cc: | assam258(at)gmail(dot)com, Junwang Zhao <zhjwpku(at)gmail(dot)com>, zengman <zengman(at)halodbtech(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Vik Fearing <vik(at)postgresfriends(dot)org>, Ajay Pal <ajay(dot)pal(dot)k(at)gmail(dot)com>, Imran Zaheer <imran(dot)zhir(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: SQL Property Graph Queries (SQL/PGQ) |
| Date: | 2026-03-26 07:08:50 |
| Message-ID: | CAExHW5tR4O0vjeqTCPr2VB5pYjNYbJgbCBEQf63NtU5Pz1MiOQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Wed, Mar 25, 2026 at 2:11 PM Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
>
> On 20.03.26 14:42, Ashutosh Bapat wrote:
> >> We should either:
> >>
> >> (1) Support G041 fully -- both directions. Junwang's patch handles
> >> this cleanly with minimal code change, or
> >> (2) Not support G041 -- reject all non-local element pattern
> >> references, throwing an error for both directions.
> >>
> >> What do you think?
> >
> > We will support the cross variable references in future once we
> > support more path patterns leading to a more mature code in that area.
> > But it doesn't harm to support backward references which can be easily
> > supported now. I don't see the code to do so getting disturbed as it
> > matures. For the segmentation fault a simple fix to add graph_pattern
> > suffices for now. Let's see if Peter has a different opinion.
>
> I think we should not allow non-local references in either direction
> (until we implement feature G041), option 2 above. Allowing only
> backward references seems kind of an arbitrary implementation artifact.
>
Sorry, I didn't explain well earlier. It isn't arbitrary. It mimics
backward lateral references.
Consider following query
Query1
SELECT x1.a, g.* FROM x1, GRAPH_TABLE (myshop MATCH (x1 IS
customers)-[IS customer_orders]->(o IS orders WHERE o.order_id = x1.a)
COLUMNS (x1.name AS customer_name, x1.customer_id AS cid)) g;
x1 is a table referenced in from clause as well as an element pattern
variable in path pattern. If we don't allow backward (cross)
referencing, x1.a in the element pattern o resolves to the table x1's
column a, but x1.name resolves to element x1's property reference
name. So within the same graph pattern x1 may get resolved to two
different things, even though x1 was declared before using it. Is that
how we expect it to happen? Let's call this inconsistency1.
However, if we allow backward references, we resolve x1 to the table
column reference till the point where x1 appears in the path pattern
and then onwards resolve x1 to the element pattern variable. So x1.a
and x1.name are both resolved into a property of element x1. Ofc, in
this example x1.a will result in an error since there is no property
with name a in the property graph. But in case label "customers" also
exposes property a, x1.a will be resolved to a property reference -
applying resolution rule consistently everywhere.
The behaviour remains consistent even with a forward reference like
Query 2
SELECT x1.a, g.* FROM x1, GRAPH_TABLE (myshop MATCH (c IS customers
WHERE c.address = 'US' AND c.customer_id = x1.a)-[IS
customer_orders]->(x1 IS orders) COLUMNS (c.name AS customer_name,
c.customer_id AS cid, x1.order_id)) g;
x1 in element pattern c is resolved to table x1 since that's the only
known resolution so far. x1.order_id however gets resolved to order_id
property of label orders as by then there is a closer reference
available.
If we do not support backward references now, but start supporting it
later, the Query 1's semantics will change, breaking backward
compatibility. Let's call this "backward incompatibility 1".
If we do not support forward element pattern references, that will be
inline with the fact that we don't support forward lateral references.
Those products which do not support lateral with graph_table don't
have to worry about any of these problems when supporting cross
element references.
We have the following options
1. Don't support cross referencing element variables, but then we have
"inconsistency1".
2. Support backward referencing element variables, with a risk of
"backward incompatibility 1". I think the standard leaves room to not
support forward referencing at all while supporting backward
referencing
3. Support both backward and forward referencing now, with a risk that
it may become a hurdle when we support more advanced features in graph
pattern matching. Given that we are in an early phase of implementing
PGQ query support, avoiding such hurdles is better.
I am leaning towards 2 + not supporting forward referencing at all.
But in the attached patchset I have implemented option 1 as you
suggested. If you are fine with option 1, let's use that patch.
On Wed, Mar 25, 2026 at 3:04 PM Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
>
> On 25.03.26 10:12, Ashutosh Bapat wrote:
> > On Wed, Mar 25, 2026 at 2:27 PM Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
> >>
> >> This functionality that you can leave off empty vertex patterns at the
> >> beginning and end is part of the optional feature G047, which we don't
> >> claim to support. So we should reject these path patterns.
> >>
> >
> > Thanks for the clarification. What about the implicit empty vertex
> > patterns in the middle of the path? Should we reject these path
> > patterns also?
> Yes, I think so.
>
> The specification of feature G047 is:
>
> "... in conforming SQL language, any <edge pattern> shall be immediately
> preceded and followed by a <vertex pattern>."
>
> So if you write
>
> () -[]-> -[]-> ()
>
> then the first edge pattern is not followed immediately by a vertex
> pattern, and so it would be invalid without feature G047.
Implemented it this way in the attached patchset.
0001: is a small adjustment to make sure that we add an element
variable name to the graph table namespace only once. This isn't a
problem right now, but we will have to do that anyway in [1] and it
has some effect one 0002.
0002: prohibits cross element variable references within graph_table
clause. It adds a new member GraphTableParseState::cur_variable to
track the variable name of the current pattern being transformed. We
can not use llast() simply because a. pattern currently being
transformed may not have name, so llast will be misleading b. changes
in 0001. We may want to combine 0001 and 0002 when committing.
0003: prohibits consecutive element patterns of the same kind
0004: some cleanup remaining from
5282bf535e474dc2517f2e835d147420ae2144de. We may want to wait to
accumulate more cleanups.
--
Best Wishes,
Ashutosh Bapat
| Attachment | Content-Type | Size |
|---|---|---|
| v20260326-0004-Cleanup-and-other-cosmetic-fixes.patch | text/x-patch | 2.7 KB |
| v20260326-0001-Add-a-graph-pattern-variable-only-once.patch | text/x-patch | 1.3 KB |
| v20260326-0002-Cross-variable-references-in-graph-pattern.patch | text/x-patch | 8.1 KB |
| v20260326-0003-Consecutive-element-patterns-of-same-kind.patch | text/x-patch | 8.1 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Nikhil Chawla | 2026-03-26 07:32:23 | Re: [PATCH] Add prepared_orphaned_transaction_timeout GUC |
| Previous Message | Michael Paquier | 2026-03-26 06:53:37 | Re: Adding locks statistics |