Re: Named Operators

From: Gurjeet Singh <gurjeet(at)singh(dot)im>
To: Vik Fearing <vik(at)postgresfriends(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Postgres Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Named Operators
Date: 2023-01-14 14:14:02
Message-ID: CABwTF4VEGbnZAiC8LX_MTc0Yw+co0g-T+KNt0SwdMtPkiHsJLA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 12, 2023 at 5:55 AM Matthias van de Meent
<boekewurm+postgres(at)gmail(dot)com> wrote:
> On Thu, 12 Jan 2023 at 11:59, Gurjeet Singh <gurjeet(at)singh(dot)im> wrote:
> > ... defining an operator
> > gives you many additional features that the planner can use to
> > optimize your query differently, which it can't do with functions. See
> > the COMMUTATOR, HASHES, etc. clause in the CREATE OPERATOR command.
>
> I see. Wouldn't it be better then to instead make it possible for the
> planner to detect the use of the functions used in operators and treat
> them as aliases of the operator? Or am I missing something w.r.t.
> differences between operator and function invocation?
>
> E.g. indexes on `int8pl(my_bigint, 1)` does not match queries for
> `my_bigint + 1` (and vice versa), while they should be able to support
> that, as OPERATOR(pg_catalog.+(int8, int8)) 's function is int8pl.

Such a feature would be immensely useful in its own right. But it's
also going to be at least 2 orders of magnitude (or more) effort to
implement, and to get accepted in the community. I'm thinking of
changes in planner, catalogs, etc.

On Thu, Jan 12, 2023 at 7:21 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> writes:
> > I'm -1 on the chosen syntax; :name: shadows common variable
> > substitution patterns including those of psql.
>
> Yeah, this syntax is DOA because of that. I think almost
> anything you might invent is going to have conflict risks.

I remember discussing this in a meeting with Joe Conway a few weeks
ago, when this was just a proposal in my head and I was just bouncing
it off him. And I remember pointing out that colons would be a bad
choice because of their use in psql; but for life of me I can't think
of a reason (except temporary memory loss) why I failed to consider
the psql conflict when implementing the feature. If only some test in
`make check` would have pointed out the mistake, I wouldn't have made
this obvious mistake.

> We could probably make it work by allowing the existing OPERATOR
> syntax to take things that look like names as well as operators,
> like
>
> expr3 OPERATOR(contains_all) expr4
>
> But that's bulky enough that nobody will care to use it.

+1. Although that'd be better for readers than the all-special-char
names, this format is bulky enough that you won't be able to convince
the query writers to bother using it. But if all other efforts fail,
I'll take this format over the cryptic ones any day.

> On the whole I don't see this proposal going anywhere.
> There's too much investment in the existing operator names,
> and too much risk of conflicts if you try to shorten the
> syntax.

I wouldn't give up on the idea, yet :-) See new proposal below.

On Thu, Jan 12, 2023 at 9:14 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Isaac Morland <isaac(dot)morland(at)gmail(dot)com> writes:
> > What about backticks (`)?
>
> Since they're already allowed as operator characters, you can't
> use them for this purpose without breaking existing use-cases.
>
> Even if they were completely unused, I'd be pretty hesitant to
> adopt them for this purpose because of the potential confusion
> for users coming from mysql.

Since when have we started caring for the convenience of users of
other databases?!! /s

> Pretty much the only available syntax space is curly braces,
> and I don't really want to give those up for this either.
> (One has to assume that the SQL committee has their eyes
> on those too.)

On Thu, Jan 12, 2023 at 9:45 AM Vik Fearing <vik(at)postgresfriends(dot)org> wrote:
> They are used in row pattern recognition.

I was very hopeful of using { }, and hoping that we'd beat the SQL
committee to it, so that they have to choose something else, if we
release this into the wild before them. But it seems that they beat us
to it long ago. (tangent: Reading some blog posts, I have to say I
loved the Row Pattern Recognition feature!)

Considering that there are almost no printable characters left in
1-255 ASCII range for us to choose from, I had to get creative; and I
believe I have found a way to make it work.

Unless the SQL committee has their eyes on a freestanding backslash \
character for something, I believe we can use it as a prefix for Named
Operators. Since the most common use of backslash is for escaping
characters, I believe it would feel natural for the users to use it as
described below.

New scheme for the named operators: \#foo That is, an identifier
prefixed with \# would serve as an operator name. psql considers \ to
be the start of its commands, but it wasn't hard to convince psql to
ignore \# and let it pass through to server.

I agree that an identifier _surrounded_ by the same token (e.g. #foo#)
or the pairing token (e.g. {foo}) looks better aesthetically, so I am
okay with any of the following variations of the scheme, as well:

\#foo\# (tested; works)
\#foo# (not tested; reduces ident length by 1)

We can choose a different character, instead of #. Perhaps \{foo} !

Attached is the v2 patch that supports \#foo style Named Operators.
Following is the SQL snippet to see what the usage looks like.

create operator \#add_point
(function = box_add, leftarg = box, rightarg = point);
create table test(a box);
insert into test values('((0,0),(1,1))'), ('((0,0),(2,1))');
select a as original, a \#add_point '(1,1)' as modified from test;
drop operator \#add_point(box, point);

Although we have never done it before, but by using backslash we
might be able to define new custom token types as well, if needed.

For those interested, I have couple of different branches with
named_operators* prefix in my Git fork [1] where I'm trying different
combinations.

[1]: https://github.com/gurjeet/postgres/branches

Best regards,
Gurjeet
http://Gurje.et

Attachment Content-Type Size
named_operators_v2.patch application/octet-stream 4.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2023-01-14 14:49:12 Re: Extracting cross-version-upgrade knowledge from buildfarm client
Previous Message Nitin Jadhav 2023-01-14 13:40:55 Re: Fix GUC_NO_SHOW_ALL test scenario in 003_check_guc.pl