Re: Idea: Avoid JOINs by using path expressions to follow FKs

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "Julien Rouhaud" <rjuju123(at)gmail(dot)com>
Cc: "Isaac Morland" <isaac(dot)morland(at)gmail(dot)com>, "Vik Fearing" <vik(at)postgresfriends(dot)org>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "Alvaro Herrera" <alvherre(at)alvh(dot)no-ip(dot)org>, "PostgreSQL Developers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Idea: Avoid JOINs by using path expressions to follow FKs
Date: 2021-03-31 19:32:09
Message-ID: 52b17e8e-41ab-4d00-a9a8-c37e03a97888@www.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 31, 2021, at 19:16, Julien Rouhaud wrote:
> On Wed, Mar 31, 2021 at 5:19 PM Joel Jacobson <joel(at)compiler(dot)org <mailto:joel%40compiler.org>> wrote:
> >
> > If using the -> notation, you would only need to manually
> > inspect the tables involved in the remaining JOINs;
> > since you could be confident all uses of -> cannot affect cardinality.
>
> Talking about that, do you have some answers to the points raised in
> my previous mail, which is how it's supposed to behave when a table is
> both join using your "->" syntax and a plain JOIN, how to join the
> same table multiple time using this new syntax, and how to add
> predicates to the join clause using this new syntax.

It's tricky, I don't see a good solution.

My original proposal aimed to improve syntax conciseness.
While this would be nice, I see much more potential value in Tom's idea
of somehow making use of foreign key constrain names.

Instead of trying to hack it into the <select list> part of a query,
maybe it's more fruitful to see if we can find a way to integrate it into the <from clause>.

Perhaps something along the lines of what Vik suggested earlier:
> FROM a JOIN b WITH a_b_fk

The problem I have with the above is "b" is redundant information,
since the foreign key is always between two specific tables,
and given "a" and "a_b_fk" we know we are joining "b".

I would prefer a new chainable binary operator.

Pavel raised some concerns about using "->" since used by the standard already,
but perhaps it is less of a problem when only used in the <from clause>?
Otherwise we could use something else entirely.

Here comes some ideas on <from clause> syntax.

With default foreign key constraint names:

SELECT DISTINCT customers.company_name
FROM order_details->order_details_product_id_fkey AS products
JOIN order_details->order_details_order_id_fkey->orders_customer_id_fkey AS customers
WHERE products.product_name = 'Chocolade';

In a PostgreSQL-only environment, foreign keys could be renamed:

ALTER TABLE orders RENAME CONSTRAINT orders_customer_id_fkey TO customers;
ALTER TABLE order_details RENAME CONSTRAINT order_details_order_id_fkey TO orders;
ALTER TABLE order_details RENAME CONSTRAINT order_details_product_id_fkey TO products;

Then we would get:

SELECT DISTINCT customers.company_name
FROM order_details->products
JOIN order_details->orders->customers
WHERE products.product_name = 'Chocolade';

Which would be the same thing as:

SELECT DISTINCT customers.company_name
FROM order_details
JOIN order_details->products
JOIN order_details->orders
JOIN orders->customers
WHERE products.product_name = 'Chocolade';

Type of join can be specified as well as aliases, just like normal:

SELECT DISTINCT c.company_name
FROM order_details AS od
JOIN od->products AS p
FULL JOIN od->orders AS o
LEFT JOIN o->customers AS c
WHERE p.product_name = 'Chocolade';

(FULL and LEFT join makes no sense in this example, but just to illustrate join types works just like normal)

I don't know how challenging this would be to integrate into the grammar though.
Here are some other ideas which might be easier to parse:

SELECT DISTINCT customers.company_name
FROM order_details->products
JOIN ON order_details->orders->customers
WHERE products.product_name = 'Chocolade';

SELECT DISTINCT customers.company_name
FROM order_details->products
JOIN USING order_details->orders->customers
WHERE products.product_name = 'Chocolade';

SELECT DISTINCT customers.company_name
FROM order_details->products
JOIN WITH order_details->orders->customers
WHERE products.product_name = 'Chocolade';

More syntax ideas?

Semantic ideas:

* When chaining, all joins on the chain would be made of the same type.
* To use different join types, you would write a separate join.
* All tables joined in the chain, would be accessible in the <select list>, via the names of the foreign key constraints.
* Only the last link on the chain can be given an alias. If you want to alias something in the middle, split the chain into two separate joins (, so that the one in the middle becomes the last one, which can then be given an alias.)

Thoughts?

/Joel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-03-31 19:34:22 Re: Crash in record_type_typmod_compare
Previous Message Tom Lane 2021-03-31 19:30:43 Re: Redundant errdetail prefix "The error was:" in some logical replication messages