Re: Typmod associated with multi-row VALUES constructs

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Typmod associated with multi-row VALUES constructs
Date: 2016-12-06 01:59:42
Message-ID: CAKFQuwYXzBQNpH5L=AHJzOjOZCZSzRvF9qiA0wwt_QZmAuYmEA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 5, 2016 at 6:36 PM, Kyotaro HORIGUCHI <
horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:

> Hello,
>
> At Mon, 5 Dec 2016 14:42:39 -0700, "David G. Johnston" <
> david(dot)g(dot)johnston(at)gmail(dot)com> wrote in <CAKFQuwZXyyPLaO0wyn94WihcjZCUs
> v8nr0FsCFrQ=oO1DkpBuA(at)mail(dot)gmail(dot)com>
> > On Mon, Dec 5, 2016 at 2:22 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > > "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
> > > > On Mon, Dec 5, 2016 at 1:08 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > >> In order to fix this, we first have to decide what the semantics
> ought
> > > >> to be. I think there are two plausible definitions:
> > > >> 1. If all the expressions in the VALUES column share the same
> typmod,
> > > >> use that typmod, else use -1.
> > > >> 2. Use -1 whenever there is more than one VALUES row.
> > >
> > > > ​Can we be precise enough to perform #2 if the top-level (or
> immediate
> > > > parent) command is an INSERT - the existing table is going to
> enforce its
> > > > own typemod anyway, otherwise go with #1?
> > >
> > > I dunno if that's "precise" or just "randomly inconsistent" ;-)
> > >
> >
> > :)
> >
> > How does "targeted optimization" sound?
>
> (sorry I don't understand what the "targetted optimization" is..)
>

I guess its a bit misleading depending on the implementation chosen. The
general thought is that we simply ignore typemod information in VALUES if
we have been told otherwise what that typemod will be (in this case an
insert into column will use the typemod of the existing column regardless
of the input data's typemod).

> FWIW, different from the UNION case, I don't see a reason that
> every row in a VALUES clause shares anything common with any
> other rows. Of course typmod and even type are not to be
> shared. (Type is shared, though.)
>

​You have a typo here somewhere..."type not to be shared. (Type is shared,
though)" doesn't work.​

>
> On the other hand, if we make all values to be strictly typed (I
> mean that every value brings its own type information along
> with), values also can consider strict type. But currently the
> following command is ignoring the type of the first value.
>
> =# select 'bar'::varchar(4) || 'eeee';
> ?column?
> ----------
> bareeee
>
>
​Its not ignored - is discarded during coercion to make the || operator
work on the same type (text). ​

​SELECT '12345'::varchar(3) || '678'​

> Even though I'm not sure about SQL standard here but my
> feeling is something like the following.
>
> | FROM (
> | VALUES (row 1), .. (row n))
> | AS foo (colname *type*, ..)
>
> for this case,
>
> | create temporary table product_codes as select *
> | from (
> | values
> | ('abcdefg'),
> | ('012345678901234567ABCDEFGHIJKLMN')
> | ) csv_data (product_code character varying(20));
>
> Myself have gotten errors for this false syntax several times:(
>

​Only the function in from form of this accepts a column definition in lieu
of a simple alias. Regardless of the merits of this idea it would not be
backpatch-able and wouldn't resolve the current valid syntax problem.​

I suppose my option #4 has the same problem...

> > ​Unnecessary maybe, but wouldn't it be immaterial given we are only able
> to
> > be efficient when inserting exactly one row.
> >
> > There is also a #4 here to consider - if the first (or any) row is not
> type
> > unknown, and the remaining rows are all unknown, use the type and typemod
> > of the known row AND attempt coerce all of the unknowns to that same
> type.
> > I'd suggest this is probably the most user-friendly option (do as I mean,
> > not as I say). The OP query would then fail since the second literal is
> > too long to fit in a varchar(20) - I would not want the value truncated
> so
> > an actual cast wouldn't work.
>
> 1 has a type of int,

​true​

1.0 has a type of float

​false - its numeric​
​select pg_typeof(1.0) -> numeric

and '1' has a typ
> ​e ​
> of text.

​false - its unknown​ (but implicitly cast-able)
select pgtype_of('1') -> error, pg_typeof(unknown) does not exist

So I don't see a situation where only the first row is
> detectably typed. Or is it means that only the first row is
> explicitly typed?

​I do indeed mean explicitly typed here.​ Using 1 or 1.0 in a values would
be a form of explicit typing in this sense.

David J.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2016-12-06 02:20:11 Re: [COMMITTERS] pgsql: Add support for restrictive RLS policies
Previous Message Craig Ringer 2016-12-06 01:46:17 Re: Typmod associated with multi-row VALUES constructs