Re: Rethinking plpgsql's assignment implementation

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Rethinking plpgsql's assignment implementation
Date: 2020-12-11 18:29:22
Message-ID: CAFj8pRAy88aO_=ZqLPm4Y7xf+GmoqFeU7N-ep84Yfui6Ts1=KQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

It is great. I expected much more work.

pá 11. 12. 2020 v 18:21 odesílatel Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> napsal:

> We've had complaints in the past about how plpgsql can't handle
> assignments to fields in arrays of records [1], that is cases like
>
> arrayvar[n].field := something;
>
> and also complaints about how plpgsql can't handle assignments
> to array slices [2], ie
>
> arrayvar[m:n] := something;
>
> As of commit c7aba7c14, we have another problem, namely that
> plpgsql's subscripted assignment only works for regular arrays;
> it won't work for other types that might define subscript
> assignment handlers.
>
> So I started to think about how to fix that, and eventually
> decided that what we ought to do is nuke plpgsql's array-assignment
> code altogether. The core code already has support for everything
> we want here in the context of field/element assignments in UPDATE
> commands; if we could get plpgsql to make use of that infrastructure
> instead of rolling its own, we'd be a lot better off.
>
> The hard part of that is that the core parser will only generate
> the structures we need (FieldStores and assignment SubscriptingRefs)
> within UPDATE commands. We could export the relevant functions
> (particularly transformAssignmentIndirection); but that won't help
> plpgsql very much, because it really wants to be able to run all this
> stuff through SPI. That means we have to have SQL syntax that can
> generate an expression of that form.
>
> That led me to think about introducing a new statement, say
>

> SET variable_name opt_indirection := a_expr
>

SQL/PSM (ANSI SQL) defines SET var = expr

If you introduce a new statement - LET, then it can be less confusing for
users, and this statement can be the foundation for schema variables. With
this statement the implementation of schema variables is significantly
simpler.

Regards

Pavel

>
> where opt_indirection is gram.y's symbol for "field selections and/or
> subscripts". The idea here is that a plpgsql statement like
>
> x[2].fld := something;
>
> would be parsed using this new statement, producing an expression
> that uses an assignment SubscriptingRef and a FieldStore operating
> on a Param that gives the initial value of the array-of-composite
> variable "x". Then plpgsql would just evaluate this expression and
> assign the result to x. Problem solved.
>
> This almost works as-is, modulo annoying parse conflicts against the
> existing variants of SET. However there's a nasty little detail
> about what "variable_name" can be in plpgsql: it can be either one or
> two identifiers, since there might be a block label involved, eg
>
> <<mylabel>> declare x int; begin mylabel.x := ...
>
> Between that and the parse-conflict problem, I ended up
> with this syntax:
>
> SET n: variable_name opt_indirection := a_expr
>
> where "n" is an integer literal indicating how many dot-separated names
> should be taken as the base variable name. Another annoying point is
> that plpgsql historically has allowed fun stuff like
>
> mycount := count(*) from my_table where ...;
>
> that is, after the expression you can have all the rest of an ordinary
> SELECT command. That's not terribly hard to deal with, but it means
> that this new statement has to have all of SELECT's other options too.
>
> The other area that doesn't quite work without some kind of hack is
> that plpgsql's casting rules for which types can be assigned to what
> are far laxer than what the core parser thinks should be allowed in
> UPDATE. The cast has to happen within the assignment expression
> for this to work at all, so plpgsql can't fix it by itself. The
> solution I adopted was just to invent a new CoercionContext value
> COERCION_PLPGSQL, representing "use pl/pgsql's rules". (Basically
> what that means nowadays is to apply CoerceViaIO if assignment cast
> lookup doesn't find a cast pathway.)
>
> A happy side-effect of this approach is that it actually makes
> some cases faster. In particular I can measure speedups for
> (a) assignments to subscripted variables and (b) cases where a
> coercion must be performed to produce the result to be assigned.
> I believe the reason for this is that the patch effectively
> merges what had been separate expressions (subscripts or casts,
> respectively) into the main result-producing expression. This
> eliminates a nontrivial amount of overhead for plancache validity
> checking, execution startup, etc.
>
> Another side-effect is that the report of the statement in error
> cases might look different. For example, in v13 a typo in a
> subscript expression produces
>
> regression=# do $$ declare x int[]; begin x[!2] = 43; end $$;
> ERROR: operator does not exist: ! integer
> LINE 1: SELECT !2
> ^
> HINT: No operator matches the given name and argument type. You might
> need to add an explicit type cast.
> QUERY: SELECT !2
> CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment
>
> With this patch, you get
>
> regression=# do $$ declare x int[]; begin x[!2] = 43; end $$;
> ERROR: operator does not exist: ! integer
> LINE 1: SET 1: x[!2] = 43
> ^
> HINT: No operator matches the given name and argument type. You might
> need to add an explicit type cast.
> QUERY: SET 1: x[!2] = 43
> CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment
>
> It seems like a clear improvement to me that the whole plpgsql statement
> is now quoted, but the "SET n:" bit in front of it might confuse people,
> especially if we don't document this new syntax (which I'm inclined not
> to, since it's useless in straight SQL). On the other hand, the
> "SELECT" that you got with the old code was confusing to novices too.
> Maybe something could be done to suppress those prefixes in error
> reports? Seems like a matter for another patch. We could also use
> some other prefix --- there's nothing particularly magic about the
> word "SET" here, except that it already exists as a keyword --- but
> I didn't think of anything I liked better.
>
> This is still WIP: I've not added any new regression test cases
> nor looked at the docs, and there's more cleanup needed in plpgsql.
> But it passes check-world, so I thought I'd put it out for comments.
>
> regards, tom lane
>
> [1]
> https://www.postgresql.org/message-id/A3691E98-CCA5-4DEB-B43C-92AD0437E09E%40mikatiming.de
> [2] https://www.postgresql.org/message-id/1070.1451345954%40sss.pgh.pa.us
>
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-12-11 18:32:29 Re: Rethinking plpgsql's assignment implementation
Previous Message Bruce Momjian 2020-12-11 18:21:21 Re: Proposed patch for key managment