Rethinking plpgsql's assignment implementation

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Rethinking plpgsql's assignment implementation
Date: 2020-12-11 17:21:17
Message-ID: 4165684.1607707277@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

We've had complaints in the past about how plpgsql can't handle
assignments to fields in arrays of records [1], that is cases like

arrayvar[n].field := something;

and also complaints about how plpgsql can't handle assignments
to array slices [2], ie

arrayvar[m:n] := something;

As of commit c7aba7c14, we have another problem, namely that
plpgsql's subscripted assignment only works for regular arrays;
it won't work for other types that might define subscript
assignment handlers.

So I started to think about how to fix that, and eventually
decided that what we ought to do is nuke plpgsql's array-assignment
code altogether. The core code already has support for everything
we want here in the context of field/element assignments in UPDATE
commands; if we could get plpgsql to make use of that infrastructure
instead of rolling its own, we'd be a lot better off.

The hard part of that is that the core parser will only generate
the structures we need (FieldStores and assignment SubscriptingRefs)
within UPDATE commands. We could export the relevant functions
(particularly transformAssignmentIndirection); but that won't help
plpgsql very much, because it really wants to be able to run all this
stuff through SPI. That means we have to have SQL syntax that can
generate an expression of that form.

That led me to think about introducing a new statement, say

SET variable_name opt_indirection := a_expr

where opt_indirection is gram.y's symbol for "field selections and/or
subscripts". The idea here is that a plpgsql statement like

x[2].fld := something;

would be parsed using this new statement, producing an expression
that uses an assignment SubscriptingRef and a FieldStore operating
on a Param that gives the initial value of the array-of-composite
variable "x". Then plpgsql would just evaluate this expression and
assign the result to x. Problem solved.

This almost works as-is, modulo annoying parse conflicts against the
existing variants of SET. However there's a nasty little detail
about what "variable_name" can be in plpgsql: it can be either one or
two identifiers, since there might be a block label involved, eg

<<mylabel>> declare x int; begin mylabel.x := ...

Between that and the parse-conflict problem, I ended up
with this syntax:

SET n: variable_name opt_indirection := a_expr

where "n" is an integer literal indicating how many dot-separated names
should be taken as the base variable name. Another annoying point is
that plpgsql historically has allowed fun stuff like

mycount := count(*) from my_table where ...;

that is, after the expression you can have all the rest of an ordinary
SELECT command. That's not terribly hard to deal with, but it means
that this new statement has to have all of SELECT's other options too.

The other area that doesn't quite work without some kind of hack is
that plpgsql's casting rules for which types can be assigned to what
are far laxer than what the core parser thinks should be allowed in
UPDATE. The cast has to happen within the assignment expression
for this to work at all, so plpgsql can't fix it by itself. The
solution I adopted was just to invent a new CoercionContext value
COERCION_PLPGSQL, representing "use pl/pgsql's rules". (Basically
what that means nowadays is to apply CoerceViaIO if assignment cast
lookup doesn't find a cast pathway.)

A happy side-effect of this approach is that it actually makes
some cases faster. In particular I can measure speedups for
(a) assignments to subscripted variables and (b) cases where a
coercion must be performed to produce the result to be assigned.
I believe the reason for this is that the patch effectively
merges what had been separate expressions (subscripts or casts,
respectively) into the main result-producing expression. This
eliminates a nontrivial amount of overhead for plancache validity
checking, execution startup, etc.

Another side-effect is that the report of the statement in error
cases might look different. For example, in v13 a typo in a
subscript expression produces

regression=# do $$ declare x int[]; begin x[!2] = 43; end $$;
ERROR: operator does not exist: ! integer
LINE 1: SELECT !2
^
HINT: No operator matches the given name and argument type. You might need to add an explicit type cast.
QUERY: SELECT !2
CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment

With this patch, you get

regression=# do $$ declare x int[]; begin x[!2] = 43; end $$;
ERROR: operator does not exist: ! integer
LINE 1: SET 1: x[!2] = 43
^
HINT: No operator matches the given name and argument type. You might need to add an explicit type cast.
QUERY: SET 1: x[!2] = 43
CONTEXT: PL/pgSQL function inline_code_block line 1 at assignment

It seems like a clear improvement to me that the whole plpgsql statement
is now quoted, but the "SET n:" bit in front of it might confuse people,
especially if we don't document this new syntax (which I'm inclined not
to, since it's useless in straight SQL). On the other hand, the
"SELECT" that you got with the old code was confusing to novices too.
Maybe something could be done to suppress those prefixes in error
reports? Seems like a matter for another patch. We could also use
some other prefix --- there's nothing particularly magic about the
word "SET" here, except that it already exists as a keyword --- but
I didn't think of anything I liked better.

This is still WIP: I've not added any new regression test cases
nor looked at the docs, and there's more cleanup needed in plpgsql.
But it passes check-world, so I thought I'd put it out for comments.

regards, tom lane

[1] https://www.postgresql.org/message-id/A3691E98-CCA5-4DEB-B43C-92AD0437E09E%40mikatiming.de
[2] https://www.postgresql.org/message-id/1070.1451345954%40sss.pgh.pa.us

Attachment Content-Type Size
0001-rewrite-plpgsql-assignment-1.patch text/x-diff 31.0 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-12-11 17:22:55 Re: Clean up ancient test style
Previous Message Peter Eisentraut 2020-12-11 16:52:12 Clean up ancient test style