Improving PL/Tcl's error context reports

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Improving PL/Tcl's error context reports
Date: 2024-06-05 17:42:30
Lists: pgsql-hackers

While working on commit b631d0149, I got a bee in my bonnet about
how unfriendly PL/Tcl's error CONTEXT reports are:

* The context reports expose PL/Tcl's internal names for the Tcl
procedures it creates, which'd be fine if those names were readable.
But actually they're something like "__PLTcl_proc_NNNN", where NNNN
is the function OID. Not only is that unintelligible, but because
the OIDs aren't stable this forces us to disable display of the
CONTEXT lines in all of PL/Tcl's regression tests.

* The first line of the context report (almost?) always duplicates
the primary error message, which is redundant and not per our
normal reporting style.

So attached is a patch that attempts to improve this situation.

The key question is how to avoid including function OIDs in the
strings that will appear in the regression test outputs. The
answer I propose is to start with an internal name like
"__PLTcl_proc_NAME", where NAME is the function's normal SQL name,
and then append the OID only if that function name is not unique.
As long as we don't create test cases that throw errors from
identically named functions, we can show the context reports and
still have stable regression outputs. I think this will improve
the experience for regular users too.

PL/Tcl wants the internal names to be all-ASCII-alphanumeric,
which saves it from having to think about encoding conversion
or quoting when inserting those names into Tcl command strings.
What I did in the attached patch is to copy only the ASCII
alphanumerics from the SQL name. Perhaps it's worth working harder,
but I failed to get excited about that.
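
To make the naming rule concrete, here is a minimal standalone sketch of it, not the code from the attached patch. The prefix follows the "__PLTcl_proc_NAME" form described above; the "_OID" suffix format, the toy name registry, and the helper names (register_name, name_in_use, build_internal_name) are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define PROC_PREFIX "__PLTcl_proc_"
#define MAX_NAMES 8
#define NAME_LEN 64

/* Toy registry standing in for "names already used by Tcl commands". */
static char used_names[MAX_NAMES][NAME_LEN];
static int	n_used = 0;

static void
register_name(const char *name)
{
	assert(n_used < MAX_NAMES);
	snprintf(used_names[n_used++], NAME_LEN, "%s", name);
}

static bool
name_in_use(const char *name)
{
	for (int i = 0; i < n_used; i++)
		if (strcmp(used_names[i], name) == 0)
			return true;
	return false;
}

/* ASCII-only test; deliberately avoids the locale-dependent isalnum(). */
static bool
is_ascii_alnum(unsigned char c)
{
	return (c >= '0' && c <= '9') ||
		(c >= 'A' && c <= 'Z') ||
		(c >= 'a' && c <= 'z');
}

/*
 * Build the internal name in dst: the prefix, then only the ASCII
 * alphanumerics of the SQL name, then "_OID" only if the result is
 * already taken (the suffix format is an assumption).
 */
static void
build_internal_name(char *dst, size_t dstlen,
					const char *sqlname, unsigned int fn_oid)
{
	size_t		len;

	assert(dst != NULL && dstlen > sizeof(PROC_PREFIX));
	snprintf(dst, dstlen, "%s", PROC_PREFIX);
	len = strlen(dst);

	for (const char *p = sqlname; *p != '\0' && len + 1 < dstlen; p++)
	{
		if (is_ascii_alnum((unsigned char) *p))
			dst[len++] = *p;
	}
	dst[len] = '\0';

	if (name_in_use(dst))
		snprintf(dst + len, dstlen - len, "_%u", fn_oid);
}
```

In this sketch a function named "my_func" gets the internal name "__PLTcl_proc_myfunc", since the underscore is not alphanumeric. If that name is then registered, a second function named "myfunc" with OID 16385 collides with it and gets "__PLTcl_proc_myfunc_16385". That collision is the cost of copying only alphanumerics, and the OID suffix is what keeps the names distinct.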

A few notes:

* To avoid unnecessarily appending the OID when a function is
redefined, I modified the logic to explicitly delete the old Tcl
command before checking for duplication. This is okay even if the
function is currently being evaluated, because Tcl's internal
reference counting prevents it from deleting the underlying code
object until it's done being executed. Really we were depending on
that reference counting to handle such cases already, but you wouldn't
have known it from our comments. I added a test case to demonstrate
explicitly that this works correctly.

* Sadly, pltcl_trigger.sql still has to suppress the context
reports. Although its function names are now stable, the reports
include trigger argument lists, which contain numeric table OIDs
and so are unstable. I don't see a way to change that without
breaking the API for user trigger functions.

* A hazard with this plan is that the regression tests' context
reports might turn out to be platform-dependent. I experimented
with Tcl 8.5 and 8.6 here and found one difference: the "missing
close-brace" error reported by our tcl_error() test case shows the
unmatched open-brace on one version but not the other. AFAICS the
point of that test is just to exercise some Tcl-detected error, not
necessarily that exact one, so I just modified the test case to cause
a different error. We might find additional problems once this patch
hits the buildfarm or gets out into the field.
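
The ordering in the first note (delete the old command, then check for duplication) can be sketched with a toy command table. This is an illustration under stated assumptions, not the patch's code: cmd_exists, cmd_delete, cmd_create and define_proc are hypothetical helpers, the "_OID" suffix format is assumed, and the sketch does not model Tcl's reference counting of a procedure that is still executing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_CMDS 8
#define NAME_LEN 64

/* Toy stand-in for the interpreter's command table; "" marks a free slot. */
static char cmds[MAX_CMDS][NAME_LEN];

static bool
cmd_exists(const char *name)
{
	for (int i = 0; i < MAX_CMDS; i++)
		if (cmds[i][0] != '\0' && strcmp(cmds[i], name) == 0)
			return true;
	return false;
}

static void
cmd_delete(const char *name)
{
	for (int i = 0; i < MAX_CMDS; i++)
		if (cmds[i][0] != '\0' && strcmp(cmds[i], name) == 0)
			cmds[i][0] = '\0';
}

static void
cmd_create(const char *name)
{
	for (int i = 0; i < MAX_CMDS; i++)
	{
		if (cmds[i][0] == '\0')
		{
			snprintf(cmds[i], NAME_LEN, "%s", name);
			return;
		}
	}
	assert(!"toy command table is full");
}

/*
 * (Re)define a function.  old_name is the internal name of the previous
 * definition, or NULL if there was none; it must not alias out.
 */
static void
define_proc(char *out, size_t outlen, const char *base,
			unsigned int fn_oid, const char *old_name)
{
	if (old_name != NULL)
		cmd_delete(old_name);	/* delete the old command first ... */

	if (cmd_exists(base))		/* ... then check for a name clash */
		snprintf(out, outlen, "%s_%u", base, fn_oid);
	else
		snprintf(out, outlen, "%s", base);

	cmd_create(out);
}
```

With this ordering, redefining a function reuses the plain name, because its own old command is gone by the time the check runs. Only a different function with the same base name picks up the OID suffix. If the check ran before the deletion, every redefinition would clash with its own previous definition.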

I'll park this in the next CF.

regards, tom lane

Attachment: better-pltcl-context-reports-v1.patch (text/x-diff, 29.5 KB)
