ExecTypeSetColNames is fundamentally broken

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: ExecTypeSetColNames is fundamentally broken
Date: 2021-12-05 18:45:47
Message-ID: 2950001.1638729947@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Changing
Fcc: +inbox
--------
I looked into the failure reported at [1]. Basically what's happening
there is that we're allowing a composite datum of type RECORD to get
stored into a table, whereupon other backends can't make sense of it
since they lack the appropriate typcache entry. The reason the datum
is marked as type RECORD is that ExecTypeSetColNames set things up
that way, after observing that the tuple descriptor obtained from
the current table definition didn't have the column names it thought
it should have.

Now, in the test case as given, ExecTypeSetColNames is in error to
think that, because it fails to account for the possibility that the
tupdesc contains dropped columns that weren't dropped when the relevant
RTE was made. However, if the test case is modified so that we just
rename rather than drop some pre-existing column, then even with a
fix for that problem ExecTypeSetColNames would do the wrong thing.

I made several attempts to work around this problem, but I've now
concluded that changing the type OID in ExecTypeSetColNames is just
fundamentally broken. It can't be okay to decide that a Var that
claims to emit type A actually emits type B, and especially not to
do so as late as executor startup. I speculated in the other thread
that we could do so during planning instead, but that turns out to
just move the problems around. I think this must be so, because the
whole idea is bogus. For example, if we have a function that is
declared to take type "ct", it can't be okay in general to pass it
type "record" instead. We've mistakenly thought that we could fuzz
this as long as the two types are physically compatible --- but how
can we think that the column names of a composite type aren't a
basic part of its definition?

So 0001 attached fixes this by revoking the decision to apply
ExecTypeSetColNames in cases where a Var or RowExpr is declared
to return a named composite type. This causes a couple of regression
test results to change, but they are ones that were specifically
added to exercise this behavior that we now see to be invalid.
(In passing, this also adjusts ExecEvalWholeRowVar to fail if the
Var claims to be of a domain-over-composite. I'm not sure what
I was thinking when I changed it to allow that; the case should
not arise, and if it did, it'd be questionable because we're not
checking domain constraints here.)

0002 is some inessential mop-up that avoids storing useless column
name lists in RowExprs where they won't be used.

I'm not really thrilled about back-patching a behavioral change
of this sort, but I don't see any good alternative. Corrupting
people's tables is not okay.

Thoughts?

regards, tom lane

[1] https://www.postgresql.org/message-id/flat/CAOFAq3BeawPiw9pc3bVGZ%3DRint2txWEBCeDC2wNAhtCZoo2ZqA%40mail.gmail.com

Attachment Content-Type Size
0001-dont-change-named-rowtype-to-RECORD.patch text/x-diff 10.2 KB
0002-dont-attach-useless-colname-lists.patch text/x-diff 4.4 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2021-12-05 19:47:55 Re: enable certain TAP tests for MSVC builds
Previous Message Chapman Flack 2021-12-05 18:14:28 Re: The "char" type versus non-ASCII characters