Re: Compatible defaults for LEAD/LAG

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Vik Fearing <vik(at)postgresfriends(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Compatible defaults for LEAD/LAG
Date: 2020-09-04 06:33:36
Message-ID: CAFj8pRD6vXsWgf2cBC1Ti-2dBBPr6yhcdNS5302f3wcypaoW7A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

po 31. 8. 2020 v 7:05 odesílatel Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
napsal:

>
>
> ne 30. 8. 2020 v 23:59 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> napsal:
>
>> Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
>> > This is nice example of usage of anycompatible type (that is consistent
>> > with other things in Postgres), but standard says something else.
>> > It can be easily solved with https://commitfest.postgresql.org/28/2081/
>> -
>> > but Tom doesn't like this patch.
>> > I am more inclined to think so this feature should be implemented
>> > differently - there is no strong reason to go against standard or
>> against
>> > the implementations of other databases (and increase the costs of
>> porting).
>> > Now the implementation is limited, but allowed behaviour is 100% ANSI.
>>
>> I don't particularly buy this argument. The case at hand is what to do
>> if we have, say,
>>
>> select lag(integer_column, 1, 1.2) over ...
>>
>> The proposed patch would result in the output being of type numeric,
>> and any rows using the default would show "1.2". The spec says that
>> the right thing is to return integer, and we should round the default
>> to "1" to make that work. But
>>
>> (1) I doubt that anybody actually writes such things;
>>
>> (2) For anyone who does write it, the spec's behavior fails to meet
>> the principle of least surprise. It is not normally the case that
>> any information-losing cast would be applied silently within an
>> expression.
>>
>
> postgres=# create table foo(a int);
> CREATE TABLE
> postgres=# insert into foo values(1.1);
> INSERT 0 1
>
> postgres=# create table foo(a int default 1.1);
> CREATE TABLE
> postgres=# insert into foo values(default);
> INSERT 0 1
> postgres=# select * from foo;
> ┌───┐
> │ a │
> ╞═══╡
> │ 1 │
> └───┘
> (1 row)
>
> It is maybe strange, but it is not an unusual pattern in SQL. I think it
> can be analogy with default clause
>
> DECLARE a int DEFAULT 1.2;
>
> The default value doesn't change a type of variable. This is maybe too
> stupid example. There can be other little bit more realistic
>
> CREATE OR REPLACE FUNCTION foo(a numeric, b numeric, ...
> DECLARE x int DEFAULT a;
> BEGIN
> ...
>
> I am afraid about performance - if default value can change type, then
> some other things can stop work - like using index.
>
> For *this* case we don't speak about some operations between two
> independent operands or function arguments. We are speaking about
> the value and about a *default* for the value.
>
>
>> So this deviation from spec doesn't bother me; we have much bigger ones.
>>
>
> ok, if it is acceptable for other people, I can accept it too - I
> understand well so it is a corner case and there is some consistency with
> other Postgres features.
>
> Maybe this difference should be mentioned in documentation.
>

I thought more about this problem, and I think so ANSI specification is
semantically fully correct - it is consistent with application of default
value elsewhere in SQL environment.

In this case the optional argument is not "any" expression. It is the
default value for some expression . In other cases we usually use forced
explicit cast.

Unfortunately we do not have good tools for generic implementation of this
situation. In other cases there the functions have special support in
parser for this case (example xmltable)

I see few possibilities how to finish and close this issue:

1. use anycompatible type and add note to documentation so expression of
optional argument can change a result type, and so this is Postgres
specific - other databases and ANSI SQL disallow this.
It is the most simple solution, and it is good enough for this case, that
is not extra important.

2. choose the correct type somewhere inside the parser - for these two
functions.

3. introduce and implement some generic solution for this case - it can be
implemented on C level via some function helper or on syntax level

CREATE OR REPLACE FUNCTION lag(a anyelement, offset int, default defval
a%type) ...

"defval argname%type" means for caller's expression "CAST(defval AS
typeof(argname))"

@3 can be a very interesting and useful feature, but it needs an agreement
and harder work
@2 this is 100% correct solution without hard work (but I am not sure if
there can be an agreement on this implementation)
@1 it is good enough for this issue without harder work and probably there
we can find an agreement simply.

Regards

Pavel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2020-09-04 06:52:56 Re: Change JOIN tutorial to focus more on explicit joins
Previous Message Craig Ringer 2020-09-04 06:13:23 Re: [PATCH] Detect escape of ErrorContextCallback stack pointers (and from PG_TRY() )