Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Assignment of valid collation for SET operations on queries with UNKNOWN types.
Date: 2017-01-18 05:25:45
Message-ID: CAB7nPqRLg2aLDiDOPaXxCOZO+OUPZzUjOkW+msTs6ah88HKooA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jan 8, 2017 at 10:55 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> writes:
>> On Tue, Jan 3, 2017 at 5:57 PM, Rahila Syed <rahilasyed90(at)gmail(dot)com> wrote:
>>> Are you suggesting extending the patch to include coercing from unknown to
>>> text for all possible cases where a column of unknown type is being created?
>
>> I guess, that's what Tom is suggesting.
>
> Yes; I think the point is we should change the semantics we assume for
> "SELECT 'foo'", not only for views but across the board. There are plenty
> of non-view-related cases where it doesn't behave well, for example
>
> regression=# select * from
> (select 'foo' from generate_series(1,3)) ss order by 1;
> ERROR: failed to find conversion function from unknown to text
>
> Furthermore, it seems like a seriously bad idea for "SELECT 'foo'" to mean
> something different in the context of CREATE VIEW than it means elsewhere.

And this offers the same semantics for any DDL using SELECT's target
list to build the list of column's types, which is consistent.

> The trick is to not break cases where we've already hacked things to allow
> external influence on the resolved types of SELECT-list items. AFAICT,
> these cases are just INSERT/SELECT and set operations (eg unions):
>
> regression=# create table target (f1 int);
> CREATE TABLE
> regression=# explain verbose insert into target select '42' from generate_series(1,3);
> QUERY PLAN
>
> --------------------------------------------------------------------------------
> ---------
> Insert on public.target (cost=0.00..10.00 rows=1000 width=4)
> -> Function Scan on pg_catalog.generate_series (cost=0.00..10.00 rows=1000
> width=4)
> Output: 42
> Function Call: generate_series(1, 3)
> (4 rows)
>
> regression=# explain verbose select '42' union all select 43;
> QUERY PLAN
> ------------------------------------------------
> Append (cost=0.00..0.04 rows=2 width=4)
> -> Result (cost=0.00..0.01 rows=1 width=4)
> Output: 42
> -> Result (cost=0.00..0.01 rows=1 width=4)
> Output: 43
> (5 rows)
>
> In both of the above cases, we're getting an integer constant not a
> text or "unknown" constant, because the type information gets imported
> from outside the "SELECT".
>
> I spent some time fooling with this today and came up with the attached
> patch. I think this is basically the direction we should go in, but
> there are various loose ends:
>
> 1. I adjusted a couple of existing regression tests whose results are
> affected, but didn't do anything towards adding new tests showing the
> desirable results of this change (as per the examples in this thread).
>
> 2. I didn't do anything about docs, either, though maybe no change
> is needed. One user-visible change from this is that queries should
> never return any "unknown"-type columns to clients anymore. But I
> think that is not documented now, so maybe there's nothing to change.
>
> 3. We need to look at whether pg_upgrade is affected. I think we
> might be able to let it just ignore the issue for views, since they'd
> get properly updated during the dump and reload anyway. But if someone
> had an actual table (or matview) with an "unknown" column, that would
> be a pg_upgrade hazard, because "unknown" doesn't store like "text".
>
> 4. If "unknown" were marked as a pseudotype not base type in pg_type
> (ie, typtype "p" not "b") then the special case for it in
> CheckAttributeType could go away completely. I'm not really sure
> why it's "b" today --- maybe specifically because of this point that
> it's currently possible to create tables with unknown columns? But
> I've not looked at what all the implications of changing that might
> be, and anyway it could be left for a followon patch.
>
> 5. I noticed that the "resolveUnknown" arguments of transformSortClause
> and other functions in parse_clause.c could go away: there's really no
> reason not to just treat them as "true" always. But that could be
> separate cleanup as well.
>
> Anybody want to hack on those loose ends?

Ashutosh, Rahila, do you have plans to review things here?
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-01-18 05:30:38 Re: Password identifiers, protocol aging and SCRAM protocol
Previous Message Noah Misch 2017-01-18 05:23:56 Re: Password identifiers, protocol aging and SCRAM protocol