Re: Allowing implicit 'text' -> xml|json|jsonb

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Marko Tiikkaja <marko(at)joh(dot)to>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Álvaro Hernández Tortosa <aht(at)nosys(dot)es>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allowing implicit 'text' -> xml|json|jsonb
Date: 2014-09-07 13:14:01
Message-ID: 540C5A19.20705@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09/07/2014 02:24 AM, Tom Lane wrote:
>>> >> The problem here seems to be only related to mistyped parameters. Can
>>> >> we contain the damage to that part only somehow? Or make this optional
>>> >> (defaulting to off, I hope)?
>> > I'd love to make it affect only parameters, actually, for v3 protocol
>> > bind/parse/execute. That would be ideal.
> Well, let's talk about that. Doing something with parameter type
> assignment seems a lot less likely to result in unexpected side-effects
> than introducing a dozen new implicit casts.

I think it'd meet the needs of the group of users I see running into
issues and would minimise impact, so that sounds good if it's practical.

However, see below. It looks like just sending 'unknown' instead of
'text' for strings from drivers might be the way to go. My concerns
about introducing new overloads may have been unfounded.

>> > Right now the main workaround is to send all string-typed parameters as
>> > 'unknown'-typed, but that causes a mess with function overload
>> > resolution, and it's wrong most of the time when the parameter really is
>> > just text.
> If you think adding implicit casts *won't* cause a mess with function
> overload resolution, I wonder why.
>
> Really though it seems like the question is how much clarity there is
> on the client side about what data types parameters should have.
> I get the impression that liberal use of "unknown" is really about
> the right thing in a lot of client APIs ...

So we'd be going down the path of asking client drivers to change how
they bound string-type parameters to 'unknown' by default, or asking
users to routinely change that default.

Thinking about it some more, that's really no different to how things
work right now when you write unparameterised queries using literals
without an explicit type-specifier or cast. Pg will resolve an
unknown-typed literal to text if there's ambiguity and one of the
choices is a text-type.

e.g. with md5(text) vs md5(bytea), if you call it with an unknown-typed
literal the text form is chosen:

regress=> SELECT md5('abcdef');
md5
----------------------------------
e80b5017098950fc58aad83c8c14978e
(1 row)

same as if you bind an explicitly unknown-typed parameter:

regress=> PREPARE md5p(unknown) AS SELECT md5($1);
PREPARE
regress=> EXECUTE md5p('abcdef');
md5
----------------------------------
e80b5017098950fc58aad83c8c14978e
(1 row)

In fact, to my surprise, using 'unknown' won't break callers who
currently send an explicit 'text' type when there's a 'varchar' overload
of the function:

regress=> create or replace function identity(varchar) returns text
language plpgsql as $$ begin raise notice 'varchar'; return $1; end; $$;
CREATE FUNCTION
regress=> create or replace function identity(text) returns text
language plpgsql as $$ begin raise notice 'text'; return $1; end; $$;
CREATE FUNCTION

regress=> SELECT identity('fred');
NOTICE: text
identity
----------
fred
(1 row)

regress=> PREPARE identity_text(text) AS SELECT identity($1);
PREPARE
craig=> EXECUTE identity_text('fred');
NOTICE: text
identity
----------
fred
(1 row)

regress=> PREPARE identity_unknown(unknown) AS SELECT identity($1);
PREPARE
craig=> EXECUTE identity_unknown('fred');
NOTICE: text
identity
----------
fred
(1 row)

regress=> PREPARE identity_varchar(varchar) AS SELECT identity($1);
PREPARE
regress=> EXECUTE identity_varchar('fred');
NOTICE: varchar
identity
----------
fred
(1 row)

So - if a driver currently sends 'varchar' for string types, and the
user has a 'varchar' and 'text' overload of the same function defined,
it'd change the overload selected. That's a (tiny) BC break.

Perhaps the solution here is just to make 'unknown' the default for
stirng-types in client drivers, make sure people have a way to change it
back, and relnote it clearly in the driver release?

In PgJDBC that's just a matter of changing the default for 'stringtype'
to 'unknown' in the 9.4 release.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Костя Кузнецов 2014-09-07 14:11:08 gist vacuum gist access
Previous Message Andrew Gierth 2014-09-07 13:11:49 Re: Final Patch for GROUPING SETS - unrecognized node type: 347