Re: [BUG] temporary file usage report with extended protocol and unnamed portals

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Mircea Cadariu <cadariu(dot)mircea(at)gmail(dot)com>, Frédéric Yhuel <frederic(dot)yhuel(at)dalibo(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Benoit Lobréau <benoit(dot)lobreau(at)dalibo(dot)com>, Guillaume Lelarge <guillaume(dot)lelarge(at)dalibo(dot)com>, Pierrick Chovelon <pierrick(dot)chovelon(at)dalibo(dot)com>
Subject: Re: [BUG] temporary file usage report with extended protocol and unnamed portals
Date: 2025-09-26 02:05:15
Message-ID: CAA5RZ0tcK=F+GBVpdFKPPDZBpq=vbQ=qUgv2WdxyutGOYs8NFg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> > After thinking about this a bit more, I found the test that breaks
> > with v12. It is a bind statement in an implicit transaction. The
> > portal will get dropped by the end of the transaction and will not
> > reach drop_unnamed_portal. So, v13 takes Frederic's original idea,
> > saves the pointer of debug_query_string, and resets it at the end of
> > DropPortal. I also enhanced the test coverage.
> >
> > Debugging also shows that the STATEMENT: log is formed while we are
> > in ExecutorEnd. We know that other extensions rely on
> > QueryDesc->sourceText in that hook (i.e., pg_stat_statements), so
> > this is another reason this approach appears safe, in addition to
> > what was mentioned here [0] about the MessageContext outliving the
> > portal.
> >
> > [0] https://www.postgresql.org/message-id/CAA5RZ0vB-h2pFimPSi72ObWfpRwKR5kaN9XWW17TOkLntC9svA%40mail.gmail.com
>
> Hmm. We are running in circles here, based on the argument that the
> drop of an unnamed portal happens when the next BIND command shows
> up, and the fact that it is an old protocol artifact that we have been
> leaving with for ages in the backend engine for efficiency.

Yes, and with the correct tests, we ultimately returned to the original idea:
temporarily switching the debug_query_string inside PortalDrop proved
to be the only viable solution. Although, this was also the approach that
Tom early-on was against due to the possibility of "The portal's query string
is potentially shorter-lived than the portal" [0], although the code
comments [1]
say otherwise. I have been able to find a test case that breaks due to
this concern.

> I do agree about the point that it may be better to encourage drivers
> to not use unnamed portals, but at the same time I don't recall
> anybody complaining about the fact that a statement string should
> absolutely match with the statement where a temporary file has been
> created and should be linked to it.

Tools like pgbadger show the wrong information with the current
output. I am surprised with the lack of complaints.

> Thinking about this problem in a twisted way, could there be an
> argument in favor of the existing logic as it is? It is true that the
> cleanup happens when the next bind query happens. So, in fact, one
> could also say that it makes sense to reflect when a temp file is
> cleaned up and what's the query being processed that does the cleanup.

One could possibly make this claim, based on the comment of what
the STATEMENT: log actually does.
```
/*
* If the user wants the query that generated this error logged, do it.
*/
if (check_log_of_query(edata))
{
log_line_prefix(&buf, edata);
appendStringInfoString(&buf, _("STATEMENT: "));
```
But, from a temporary file logging perspective, this is very confusing
for the user.

> In short, I would be inclined to do nothing here,

Maybe we document this behavior? [2]

> but I do see an
> argument in favor of the tests, particularly to track the fact that
> the unnamed portal drop happens in the next BIND query, and looking

+1

[0] https://www.postgresql.org/message-id/23969.1745083695%40sss.pgh.pa.us
[1] https://www.postgresql.org/message-id/CAA5RZ0vB-h2pFimPSi72ObWfpRwKR5kaN9XWW17TOkLntC9svA%40mail.gmail.com
[2] https://www.postgresql.org/docs/current/runtime-config-logging.html#GUC-LOG-TEMP-FILES

--
Sami

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Guo 2025-09-26 02:26:16 Re: plan shape work
Previous Message Tom Lane 2025-09-26 01:53:30 Re: plan shape work