Inconsistency in plpgsql's error context reports

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Inconsistency in plpgsql's error context reports
Date: 2017-12-10 16:55:04
Message-ID: 31545.1512924904@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I noticed that since I put in commit 390d58135, buildfarm members
that use CLOBBER_CACHE_ALWAYS are failing the added test case
with diffs like

*** 5051,5057 ****
NOTICE: y = 2
ERROR: record "r" is not assigned yet
DETAIL: The tuple structure of a not-yet-assigned record is indeterminate.
! CONTEXT: PL/pgSQL function inline_code_block line 15 at RAISE
-- Check handling of conflicts between plpgsql vars and table columns.
set plpgsql.variable_conflict = error;
create function conflict_test() returns setof int8_tbl as $$
--- 5051,5058 ----
NOTICE: y = 2
ERROR: record "r" is not assigned yet
DETAIL: The tuple structure of a not-yet-assigned record is indeterminate.
! CONTEXT: SQL statement "SELECT r"
! PL/pgSQL function inline_code_block line 15 at RAISE
-- Check handling of conflicts between plpgsql vars and table columns.
set plpgsql.variable_conflict = error;
create function conflict_test() returns setof int8_tbl as $$

The reason for the difference turns out to be that in a CCA build,
we are forcing recreation of the cached plan for the simple expression
"r", so the error gets detected during parse analysis of "SELECT r"
rather than during execution of the expression. And because this all
mostly relies on SPI, we've gone through SPI_plan_get_cached_plan,
which inserts _SPI_error_callback into the error context stack.
That's where the first CONTEXT line is coming from.

There seem to be two ways we could look at this. One is that the
new test case just needs to be rejiggered to avoid unstable output
("\set VERBOSITY terse" would be the easiest way). But there is
also room to argue that it's bad that plpgsql produces error reports
that vary depending on the phase of the moon, which is pretty much
what this would look like in the field --- cache flushes will occur
unpredictably in most application environments. In that view, where
exec_eval_simple_expr() bypasses SPI but nonetheless "has to do some of
the things SPI_execute_plan would do", one of the things it ought to be
doing is setting up an error context stack entry to duplicate the one
that SPI_execute_plan would push.

I'm of mixed mind about whether this is a good idea. Adding more overhead
to exec_eval_simple_expr() isn't great, even though it'd be just a few
instructions. And in most cases the statement-level context line that
you get anyway ought to be sufficient to localize the problem. But it's
not too hard to imagine that a CONTEXT line that only shows up some of
the time could break somebody's error handling code. It's also arguable
that it's weird that you get different CONTEXT reports depending on
whether exec_simple_check_plan thinks the expression is simple or not,
eg

regression=# do $$ declare x int := 1;
begin raise notice '%', x/0;
end $$;
ERROR: division by zero
CONTEXT: PL/pgSQL function inline_code_block line 2 at RAISE

regression=# do $$ declare x int := 1;
begin raise notice '%', (select x/0);
end $$;
ERROR: division by zero
CONTEXT: SQL statement "SELECT (select x/0)"
PL/pgSQL function inline_code_block line 2 at RAISE

Thoughts?

regards, tom lane

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2017-12-10 19:46:08 Re: pl/perl extension fails on Windows
Previous Message james 2017-12-10 16:24:56 Re: Postgres with pthread