Re: Typmod associated with multi-row VALUES constructs

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Typmod associated with multi-row VALUES constructs
Date: 2016-12-06 04:54:08
Message-ID: CAKFQuwYOeesXJ1bH31b2MKx1UStEzrpYe=tSAO2-eg1Ai4=Eww@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

feel free to s/typemod/typmod/ ... my fingers don't want to drop the "e"

On Mon, Dec 5, 2016 at 9:17 PM, Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> Hello,
>
> At Mon, 5 Dec 2016 18:59:42 -0700, "David G. Johnston" <
> david(dot)g(dot)johnston(at)gmail(dot)com> wrote in <CAKFQuwYXzBQNpH5L=AHJzOjOZCZS
> zRvF9qiA0wwt_QZmAuYmEA(at)mail(dot)gmail(dot)com>
> > On Mon, Dec 5, 2016 at 6:36 PM, Kyotaro HORIGUCHI <
> > horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> >
>
> (It's not typo but my poor wording... Sorry.)
> Mmm. I think the typemod of a row should not be applied onto
> other rows, and the same can be said for types. But type of the
> first row is applied to all of the rest rows, though... Does it
> make sense?
>

Yes. ​All rows in a given relation must end up with the exact same type
and it isn't friendly to fail when a implicit conversion from unknown is
possible.

> But what I wanted to say was not that but the something like the
> following.
>
> select pg_typeof('12345'::varchar(1));
> pg_typeof
> -------------------
> character varying
>
> A value seemingly doesn't have typmod. So it seems reasonable
> that VALUES cannot handle typmod. It seems reasonable regardless
> of how it is acutually implemented.
>

​This is an artifact of functions - the typemod associated with the value
'12345' is lost when that value is passed into the function pg_typeof.
Thus it is impossible to write a SQL query the reports the typemod of a
literal or column reference. Nonetheless it is there in reality. Just see
the original CREATE TABLE AS example for proof. The created table's column
is shown (using direct catalog queries) to contain typemod value of ​20 -
which it could only have gotten from the first values rows which contained
a casted literal.

> > > Even though I'm not sure about SQL standard here but my
> > > feeling is something like the following.
> > >
> > > | FROM (
> > > | VALUES (row 1), .. (row n))
> > > | AS foo (colname *type*, ..)
> > >
> > > for this case,
> > >
> > > | create temporary table product_codes as select *
> > > | from (
> > > | values
> > > | ('abcdefg'),
> > > | ('012345678901234567ABCDEFGHIJKLMN')
> > > | ) csv_data (product_code character varying(20));
> > >
> > > Myself have gotten errors for this false syntax several times:(
> > >
> >
> > ​Only the function in from form of this accepts a column definition in
> lieu
> > of a simple alias. Regardless of the merits of this idea it would not be
> > backpatch-able and wouldn't resolve the current valid syntax problem.​
>
> Yeah.. It wouldn't be back-patchable. I personally think that it
> is not necessary to be "solve"d, since a value doesn't rigged
> with typmod.
>
> But I undetstand what we want to solve here is how to change
> VALUES's behavior to perfectly coerce the row values to the types
> explicity applied in the first row. Right?
>

​Actually, no, since it is not possible to coerce "perfectly". Since any
attempt at doing so could fail it is only possible to scan every row and
compare its typemod to the first row's typemod. Ideally (but maybe not in
reality) as soon as a discrepancy is found stop and discard the typemod.
If every row passes you can retain the typemod. That is arguably the
​perfect solution. The concern is that "scan every row" could be very
expensive - though in writing this I'm thinking that you'd quickly find a
non-match even in a large dataset - and so a less perfect but still valid
solution is to simply discard the typemod if there is more than one row.
My thought was that if you are going to discard typemod in the n > 1 case
for consistency you should discard the typemod in the n = 1 case as well.

That is, in a nutshell, options 1, 2, and 3 in order.

The "fault" in #1 that #4 attempted to fix was that VALUES are often hand
entered and so the inexperienced would like to only type in a cast one the
first row and have it will apply to all subsequent rows. That would be a
feature request, though. Your capability to add type information to any
FROM alias is likewise a feature request - solving the overall problem by
giving the author a place to specify the desired type explicitly without
having to pollute the query with excessive casts or subqueries.

David J.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2016-12-06 04:54:55 Re: Adding in docs the meaning of pg_stat_replication.sync_state
Previous Message Joseph Brenner 2016-12-06 04:53:41 Re: Select works only when connected from login postgres