Re: [PATCH] PREPARE TRANSACTION unexpected behavior with TEMP TABLE

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Himanshu Upadhyaya <upadhyaya(dot)himanshu(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] PREPARE TRANSACTION unexpected behavior with TEMP TABLE
Date: 2021-04-14 13:46:43
Message-ID: CA+TgmoaQHd=Ln2_QoHLGpg8ibu5F3HjVOZTF3da3adSN6sEemQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 14, 2021 at 6:45 AM Himanshu Upadhyaya
<upadhyaya(dot)himanshu(at)gmail(dot)com> wrote:
> The purpose of this FIX is mainly to get consistent behavior from
> PREPARE TRANSACTION. In the case I mentioned previously, my expectation
> was that either both PREPARE TRANSACTION commands should fail or both
> should succeed, but here the second, identical "PREPARE TRANSACTION"
> succeeded while the first one failed with an error, which is kind of
> weird to me.
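If I'm reconstructing your scenario correctly, it's something along
these lines (an untested sketch; the table name and exact statements are
my guesses, not your script):

create temp table t1 (a int);    -- temp table created before the transactions
begin;
select null::t1;                 -- touches only t1's metadata; the relation is
                                 -- opened once to build its relcache entry
prepare transaction 'foo';       -- ERROR: transaction operated on temp objects
begin;
select null::t1;                 -- metadata already cached; no open, no flag
prepare transaction 'foo';       -- reportedly succeeds

If that reading is right, the first PREPARE fails because building the
cache entry set the temp-access flag, while by the second attempt the
entry already exists and the flag is never set.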

I agree that it's weird, but that doesn't mean that your patch is an
improvement, and I don't think it is. If we could make this thing more
consistent without incurring any negatives, I'd be in favor of that.
But the patch does have some negatives, which in my opinion are more
substantial than the problem you're trying to fix. Namely, possible
performance consequences, and undocumented and fragile assumptions
that, as it seems to me, may easily get broken in the future. I see
that you've repeatedly capitalized the word FIX in your reply, but
it's just not that simple. If this had really bad consequences like
corrupting data or crashing the server then it would be essential to
do something about it, but so far the worst consequence you've
indicated is that an obscure sequence of SQL commands that no real
user is likely to issue produces a slightly surprising result. That's
not great, but neither is it an emergency.

> I have also tried to reproduce the behavior.

Your test case isn't ideal for reproducing the problem that the
comment is worrying about. Normally, when we take a lock on a table,
we hold it until commit. But, that only applies when we run a query
that mentions the table, like a SELECT or an UPDATE. In your case, we
only end up opening the table to build a relcache entry for it, so
that we can look at the metadata. And, catalog scans used to build
syscache and relcache entries release locks immediately, rather than
waiting until the end of the transaction. So it might be that if we
failed to ever set XACT_FLAGS_ACCESSEDTEMPNAMESPACE in your test case,
everything would be fine.
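One way to check the lock-retention part of that claim, if anyone wants
to (again an untested sketch, using a hypothetical temp table t1):

begin;
select null::t1;                   -- metadata-only access to the temp table
select mode, granted from pg_locks
  where relation = 't1'::regclass; -- expect no row if the catalog-scan
                                   -- locks were already released
select * from t1;                  -- a query that actually mentions t1
select mode, granted from pg_locks
  where relation = 't1'::regclass; -- now an AccessShareLock held to commit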

That doesn't seem to be true in general though. I tried changing
"cannot PREPARE a transaction that has operated on temporary objects"
from an ERROR to a NOTICE and then ran 'make check'. It hung. I think
the following test case hits the same problem the regression tests do;
in any case, it also hangs:

rhaas=# begin;
BEGIN
rhaas=*# create temp table first ();
CREATE TABLE
rhaas=*# prepare transaction 'whatever';
NOTICE: cannot PREPARE a transaction that has operated on temporary objects
PREPARE TRANSACTION
rhaas=# create temp table second ();
[ hangs ]

I haven't checked, but I think the problem here is that the first
transaction had to create this backend's pg_temp schema, and the second
one can't see the results of that, so it tries to do the same thing and
ends up waiting for a lock the prepared transaction already holds. I
made a quick attempt to reproduce a hang
at backend exit time, but couldn't find a case where that happened.
That doesn't mean there isn't one, though. There's a very good chance
that the person who wrote that comment knew that a real problem
existed, and just didn't describe it well enough for you or me to
immediately know what it is. It is also possible that they were
completely wrong, or that things have changed since the comment was
written, but we can't assume that without considerably more research
and analysis than either of us has done so far.
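For anyone who hits the hang while experimenting: since the blocker is
a prepared transaction, it can be inspected and cleared from any other
session, roughly like this (sketch, using the gid from the transcript
above):

select gid, prepared, owner from pg_prepared_xacts;
rollback prepared 'whatever';    -- drops the prepared transaction and
                                 -- releases its locks, letting the
                                 -- blocked CREATE proceed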

I think one point to take away here is that the question of whether a
temporary relation has been "accessed" is not entirely black and white. If
I ran a SELECT statement against a relation, I think it's clear that
I've accessed it. But, if I just used the name of that relation as a
type name in some other SQL command, did I really access it? The
current code's answer to that is that if we had to open and lock the
relation to get its metadata, then we accessed it, and if we already
had the details that we needed in cache, then we did not access it.
Now, again, I agree that looks a little weird from a user point of
view, but looking at the implementation, you can kind of see why it
ends up like that. From a certain point of view, it would be more
surprising if we never opened or locked the relation and yet ended up
deciding that we'd accessed it. Now maybe we should further explore
going the other direction and avoiding setting the flag at all in this
case, since I think we're neither retaining a lock nor touching any
relation buffers, but I think that needs more analysis. Even if we
decide that's safe, there's still the problem of finding a better
implementation that's not overly complex for what really feels like a
very minor issue.

--
Robert Haas
EDB: http://www.enterprisedb.com
