Re: Named Operators

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Cc: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Gurjeet Singh <gurjeet(at)singh(dot)im>, Postgres Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Named Operators
Date: 2023-02-08 15:57:57
Message-ID: 3491355.1675871877@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> writes:
> On 12.01.23 14:55, Matthias van de Meent wrote:
>>> Matter of taste, I guess. But more importantly, defining an operator
>>> gives you many additional features that the planner can use to
>>> optimize your query differently, which it can't do with functions. See
>>> the COMMUTATOR, HASHES, etc. clause in the CREATE OPERATOR command.

>> I see. Wouldn't it be better then to instead make it possible for the
>> planner to detect the use of the functions used in operators and treat
>> them as aliases of the operator? Or am I missing something w.r.t.
>> differences between operator and function invocation?
>> E.g. indexes on `int8pl(my_bigint, 1)` does not match queries for
>> `my_bigint + 1` (and vice versa), while they should be able to support
>> that, as OPERATOR(pg_catalog.+(int8, int8)) 's function is int8pl.

> I have been thinking about something like this for a long time.
> Basically, we would merge pg_proc and pg_operator internally. Then, all
> the special treatment for operators would also be available to
> two-argument functions.

I had a thought about this ...

I do not think this proposal is going anywhere as-written.
There seems very little chance that we can invent a syntax that
is concise, non-ugly, and not likely to get blindsided by future
SQL spec extensions. Even if we were sure that, say, "{foo}"
was safe from spec interference, the syntax "a {foo} b" has
exactly nothing to recommend it compared to "foo(a,b)".
It's not shorter, it's not standard, it won't help any pre-existing
queries, and it can't use function-call features such as named
arguments.

As Matthias said, what we actually need is for the planner to be able
to optimize function calls on the same basis as operators. We should
tackle that directly rather than inventing new syntax.

We could go after that by inventing a bunch of new function properties
to parallel operator properties, but there is a simpler way: just
teach the planner to look to see if a function call is a call of the
underlying function of some operator, and if so treat it like that
operator. Right now that'd be an expensive lookup, but we could
remove that objection with an index on pg_operator.oprcode or a
single new field in pg_proc.

This approach does have a couple of shortcomings:

* You still have to invent an operator name, even if you never
plan to use it in queries. This is just cosmetic though.
It's not going to matter if the operator name is long or looks like
line noise, if you only need to use it a few times in setup DDL.

* We could not extend this to support index functions with more than
two arguments, a request we've heard once in awhile in the past.
Our answer to that so far has been "make a function/operator with
one indexed argument and one composite-type argument", which is a
bit of an ugly workaround but seems to be serviceable enough.

On the whole I don't think these shortcomings are big enough
to justify all the work that would be involved in attaching
operator-like optimization information directly to functions.
(To mention just one nontrivial stumbling block: do you really
want to invent "shell functions" similar to the shell-operator
hack? If not, how are you going to handle declaration of
commutator pairs?)

In the long run this might lead to thinking of pg_operator as
an extension of pg_proc in the same way that pg_aggregate is.
But we have not unified pg_aggregate into pg_proc, and I don't
think anyone wants to, because pg_proc rows are undesirably
wide already. There's a similar objection to attaching
optimization fields directly to pg_proc.

You could imagine some follow-on internal cleanup like trying
to unify FuncExpr and OpExpr into a single node type (carrying
a function OID and optionally an operator OID). But that need
not have any user-visible impact either; it'd mainly be good
for eliminating a lot of near-duplicate code.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-02-08 16:02:24 Re: OpenSSL 3.0.0 vs old branches
Previous Message Justin Pryzby 2023-02-08 15:56:44 Re: typos