Re: MySQL search query is not executing in Postgres DB

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter_e(at)gmx(dot)net>, Greg Stark <stark(at)mit(dot)edu>, Bruce Momjian <bruce(at)momjian(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, premanand <kottiprem(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: MySQL search query is not executing in Postgres DB
Date: 2012-08-30 13:13:08
Message-ID: CA+TgmobFyfWH1D9-rZUOg11pF2D19H4KZ6BHGotf_QFJm=dvDQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 29, 2012 at 6:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Well, I see your point about LPAD(), but the problem is how to tell
> the difference between a harmless cast omission and an actual mistake
> that the user will be very grateful if we point out. If we allow
> implicit casts to text in the general case in function/operator calls,
> we are definitely going to re-introduce a lot of room for mistakes.

I concede that point. :-)

> Upthread you were complaining about how we'd reject calls even when
> there was only one possible interpretation. I wonder whether there'd be
> any value in taking that literally: that is, allow use of assignment
> rules when there is, in fact, exactly one function with the right number
> of parameters visible in the search path. This would solve the LPAD()
> problem (at least as stated), and probably many other practical cases
> too, since I admit your point that an awful lot of users do not use
> function overloading. The max() example I mentioned earlier would not
> get broken since there's more than one max(), and in general it seems
> likely that cases where there's a real risk would involve overloaded
> names.

That's an interesting idea. I like it.

> The main downside I can see is that code that used to work is likely
> to stop working as soon as someone creates a potential overloading
> situation. Worse, the error message could be pretty confusing, since
> if you had been successfully calling f(smallint) with f(42), you'd get
> "f(integer) does not exist", not something like "f() is ambiguous",
> after adding f(float8) to the mix. This seems related to the confusing
> changes in regression test cases that I got in my experiments yesterday.

One thought I had when looking at those messages was that, in some
ways, the new messages were actually less confusing than the old
messages. I mean, if you try to call f(42) and you get f(integer) does
not exist, ok, you'll probably figure out that the issue is with the
argument type, since you most likely know that an f of some type does
in fact exist. But it would be even more clear if the error message
said, ok, so there is an f, but I'm not going to call it because the
argument types don't match closely enough. The distinction would be
even more useful if the function happens to be called snuffleupagus
rather than f, because then when you call snufleupagus(42.0), it'll
tell you "i know nothing about a function by that name" whereas when
you call snuffleupagus(42) it'll tell you "i know about a function by
that name, but not with those argument types". I've certainly
encountered this confusion before whilst debugging my own and other
people's databases: is it giving me that error because the function
doesn't exist, or because of an argument type mismatch?

> This may be sufficient reason to reject the idea, since the very last
> thing we need in this area is any degradation in the relevance of the
> error messages.
>
>> ... as long as I work for a company that helps
>> people migrate from other database systems, I'm not going to be able
>> to stop caring about this issue even in cases where I don't personally
>> think implicit casting is a good idea, because other people who are
>> not me have tens of thousands of lines of procedural code written for
>> those other systems and if you tell them they've got to go through and
>> add hundreds or thousands of casts before they can migrate, it tends
>> to turn them off. Maybe there's no perfect solution to that problem,
>> but the status quo is definitely not perfect either.
>
> Meh. I tend to think that a better solution to those folks' problem is
> a package of add-on casts that they could install for use with their
> legacy code; not dumbing down the system's error detection capability
> for everyone. Peter's original try at re-adding implicit text casts
> in that way didn't work very well IIRC, but maybe we could try harder.

Well, the big problem that you run into is that when you add casts,
you tend to create situations that the type system thinks are
ambiguous. A particular example of this is textanycat, anytextcat,
and plain old textcat. If you start adding casts, the system can get
confused about which one it's supposed to call in which situation.
The frustrating thing is that we don't really care. The only reason
why there are three different operators in the first place is because
we want to make sure that everything someone does will match one of
them. But then if something matches two of them, we error out
unnecessarily.

It would be nice to have a way to say "among this group of functions,
we don't care" or perhaps "among this group of functions, here is a
preference ordering; in case of doubt, pick the one with the highest
preference". But in some sense I feel that that isn't really solving
the problem, because the only reason those extra functions exist in
the first place is to work around the fact that sometimes the system
doesn't perform typecasts in situations where we wish it did. It's
almost like we should have a way to flag argument positions and say
"for this function, in this argument position, feel free to implicitly
cast to text". Then, for example, you'd only need one quote_literal()
rather than two.

This is possibly unsatisfying as well because there likely will be
disagreement about which functions ought to have that
implicit-casting-ok behavior and which ones should not. Maybe that'd
be OK anyway; people could always hack their local catalogs if need
be. But, all things being equal, it would be nice to design this in a
way where you could package up a bunch of behavior in the form of an
extension that users could either install or not install according to
the semantics that they wish to have. Your proposal of adding casts
is appealing from that perspective, but it'd work a lot better in
practice if we could eliminate the practice of having multiple
versions of functions with the same semantics just to elicit the
desired casting behavior.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2012-08-30 13:39:50 Re: Getting rid of cheap-startup-cost paths earlier
Previous Message David Fetter 2012-08-30 13:06:39 Re: MySQL search query is not executing in Postgres DB