Re: [bug+patch] Inserting DEFAULT into generated columns from VALUES RTE

From: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Mikhail Titov <mlt(at)gmx(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [bug+patch] Inserting DEFAULT into generated columns from VALUES RTE
Date: 2020-11-20 14:30:25
Message-ID: CAEZATCVaZNONSW1yEZsAe6U_-O0kqyPTv_N4_60ydd=ot7y79Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 6 Sept 2020 at 22:42, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> I think you'd be better off to make transformInsertStmt(), specifically
> its multi-VALUES-rows code path, check for all-DEFAULT columns and adjust
> the tlist itself. Doing it there might be a good bit less inefficient
> for very long VALUES lists, too, which is a case that we do worry about.
> Since that's already iterating through the sub-lists, you could track
> whether all rows seen so far contain SetToDefault in each column position,
> and avoid extra scans of the sublists. (A BitmapSet might be a convenient
> representation of that, though you could also use a bool array I suppose.)
>
> I do not care for what you did in rewriteValuesRTE() either: just removing
> a sanity check isn't OK, unless you've done something to make the sanity
> check unnecessary which you surely have not. Perhaps you could extend
> the initial scan of the tlist (lines 1294-1310) to notice SetToDefault
> nodes as well as Var nodes and keep track of which columns have those.
> Then you could cross-check that one or the other case applies whenever
> you see a SetToDefault in the VALUES lists.

That's not quite right because by the time rewriteValuesRTE() sees the
tlist, it will contain already-rewritten generated column expressions,
not SetToDefault nodes. If we're going to keep that sanity check (and
I think that we should), I think that the way to do it is to have
rewriteTargetListIU() record which columns it has expanded defaults
for, and pass that information to rewriteValuesRTE(). Those columns of
the VALUES RTE are no longer used in the query, so the sanity check
can be amended to ignore them while continuing to check the other
columns.

Incidentally, there is another way of causing that sanity check to
fail -- an INSERT ... OVERRIDING USER VALUE query with some DEFAULTS
in the VALUES RTE (but not necessarily all DEFAULTs) will trigger it.
So even if the parser completely removed any all-DEFAULT columns from
the VALUES RTE, some work in the rewriter would still be necessary.

> BTW, another thing that needs checking is whether a rule containing
> an INSERT like this will reverse-list sanely. The whole idea of
> replacing some of the Vars might not work so far as ruleutils.c is
> concerned. In that case I think we might have to implement this
> by having transformInsertStmt restructure the VALUES lists to
> physically remove the all-DEFAULT column, and adjust the target column
> list accordingly --- that is, make a parse-time transformation of
> INSERT INTO gtest0 VALUES (1, DEFAULT), (2, DEFAULT);
> into
> INSERT INTO gtest0(a) VALUES (1), (2);
> That'd have the advantage that you'd not have to hack up the
> rewriter at all.

I think it's actually easier to just do it all in the rewriter -- at
the point where we see that we're about to insert potentially illegal
values from a VALUES RTE into a generated column, scan it to see if
all the values in that column are DEFAULTs, and if so trigger the
existing code to replace the Var in the tlist with the generated
column expression. That way we only do extra work in the case for
which we're currently throwing an error, rather than for every query.
Also, I think that scanning the VALUES lists in this way is likely to
be cheaper than rebuilding them to remove a column.

Attached is a patch doing it that way, along with additional
regression tests that trigger both the original error and the
sanity-check error triggered by INSERT ... OVERRIDING USER VALUES. I
also added a few additional comments where I found the existing code a
little non-obvious.

I haven't touched the existing error messages. I think that's a
subject for a separate patch.

Regards,
Dean

Attachment Content-Type Size
fix-generated-cols-with-defaults-in-values-rte.patch text/x-patch 21.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Laurenz Albe 2020-11-20 14:41:05 Re: Add session statistics to pg_stat_database
Previous Message Peter Eisentraut 2020-11-20 14:26:39 Re: Keep elog(ERROR) and ereport(ERROR) calls in the cold path