Re: Adding OLD/NEW support to RETURNING

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Adding OLD/NEW support to RETURNING
Date: 2024-02-24 17:52:36
Message-ID: b1ab256b-d241-4226-a70b-658ff98790ec@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 12/4/23 13:14, Dean Rasheed wrote:
> I have been playing around with the idea of adding support for OLD/NEW
> to RETURNING, partly motivated by the discussion on the MERGE
> RETURNING thread [1], but also because I think it would be a very
> useful addition for other commands (UPDATE in particular).
>

Sounds reasonable ...

> This was discussed a long time ago [2], but that previous discussion
> didn't lead to a workable patch, and so I have taken a different
> approach here.
>

Presumably the 2013 thread went nowhere because of some implementation
problems, not simply because the author lost interest and disappeared?
Would it be helpful for this new patch to briefly summarize what the
main issues were and how this new approach deals with that? (It's hard
to say if reading the old thread is necessary/helpful for understanding
this new patch, and time is a scarce resource.)

> My first thought was that this would only really make sense for UPDATE
> and MERGE, since OLD/NEW are pretty pointless for INSERT/DELETE
> respectively. However...
>
> 1. For an INSERT with an ON CONFLICT ... DO UPDATE clause, returning
> OLD might be very useful, since it provides a way to see which rows
> conflicted, and return the old conflicting values.
>
> 2. If a DELETE is turned into an UPDATE by a rule (e.g., to mark rows
> as deleted, rather than actually deleting them), then returning NEW
> can also be useful. (I admit, this is a somewhat obscure use case, but
> it's still possible.)
>
> 3. In a MERGE, we need to be able to handle all 3 command types anyway.
>
> 4. It really isn't any extra effort to support INSERT and DELETE.
>
> So in the attached very rough patch (no docs, minimal testing) I have
> just allowed OLD/NEW in RETURNING for all command types (except, I
> haven't done MERGE here - I think that's best kept as a separate
> patch). If there is no OLD/NEW row in a particular context, it just
> returns NULLs. The regression tests contain examples of 1 & 2 above.
>
>
> Based on Robert Haas' suggestion in [2], the patch works by adding a
> new "varreturningtype" field to Var nodes. This field is set during
> parse analysis of the returning clause, which adds new namespace
> aliases for OLD and NEW, if tables with those names/aliases are not
> already present. So the resulting Var nodes have the same
> varno/varattno as they would normally have had, but a different
> varreturningtype.
>

No opinion on whether varreturningtype is the right approach - it sounds
like it's working better than the 2013 patch, but I won't pretend my
knowledge of this code is sufficient to make judgments beyond that.

> For the most part, the rewriter and parser are then untouched, except
> for a couple of places necessary to ensure that the new field makes it
> through correctly. In particular, none of this affects the shape of
> the final plan produced. All of the work to support the new Var
> returning type is done in the executor.
>
> This turns out to be relatively straightforward, except for
> cross-partition updates, which was a little trickier since the tuple
> format of the old row isn't necessarily compatible with the new row,
> which is in a different partition table and so might have a different
> column order.
>

So we just "remap" the attributes, right?

> One thing that I've explicitly disallowed is returning OLD/NEW for
> updates to foreign tables. It's possible that could be added in a
> later patch, but I have no plans to support that right now.
>

Sounds like an acceptable restriction, as long as it's documented.

What are the challenges for supporting OLD/NEW for foreign tables? I
guess we'd need to ask the FDW handler to tell us if it can support
OLD/NEW for this table (and only allow it for postgres_fdw with
sufficiently new server version), and then deparse the SQL.

I'm asking because this seems like a nice first patch idea, but if I
don't see some major obstacle that I don't see ...

>
> One difficult question is what names to use for the new aliases. I
> think OLD and NEW are the most obvious and natural choices. However,
> there is a problem - if they are used in a trigger function, they will
> conflict. In PL/pgSQL, this leads to an error like the following:
>
> ERROR: column reference "new.f1" is ambiguous
> LINE 3: RETURNING new.f1, new.f4
> ^
> DETAIL: It could refer to either a PL/pgSQL variable or a table column.
>
> That's the same error that you'd get if a different alias name had
> been chosen, and it happened to conflict with a user-defined PL/pgSQL
> variable, except that in that case, the user could just change their
> variable name to fix the problem, which is not possible with the
> automatically-added OLD/NEW trigger variables. As a way round that, I
> added a way to optionally change the alias used in the RETURNING list,
> using the following syntax:
>
> RETURNING [ WITH ( { OLD | NEW } AS output_alias [, ...] ) ]
> * | output_expression [ [ AS ] output_name ] [, ...]
>
> for example:
>
> RETURNING WITH (OLD AS o) o.id, o.val, ...
>
> I'm not sure how good a solution that is, but the syntax doesn't look
> too bad to me (somewhat reminiscent of a WITH-query), and it's only
> necessary in cases where there is a name conflict.
>
> The simpler solution would be to just pick different alias names to
> start with. The previous thread seemed to settle on BEFORE/AFTER, but
> I don't find those names particularly intuitive or appealing. Over on
> [1], PREVIOUS/CURRENT was suggested, which I prefer, but they still
> don't seem as natural as OLD/NEW.
>
> So, as is often the case, naming things turns out to be the hardest
> problem, which is why I quite like the idea of letting the user pick
> their own name, if they need to. In most contexts, OLD and NEW will
> work, so they won't need to.
>

I think OLD/NEW with a way to define a custom alias when needed seems
acceptable. Or at least I can't think of a clearly better solution. Yes,
using some other name might not have this problem, but I guess we'd have
to pick an existing keyword or add one. And Tom didn't seem thrilled
with reserving a keyword in 2013 ...

Plus I think there's value in consistency, and OLD/NEW seems way more
natural that BEFORE/AFTER.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2024-02-24 18:52:16 Re: Relation bulk write facility
Previous Message Noah Misch 2024-02-24 17:23:45 Re: Relation bulk write facility