Re: plpgsql_check_function - rebase for 9.3

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: plpgsql_check_function - rebase for 9.3
Date: 2013-03-26 03:14:25
Message-ID: 27661.1364267665@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> Where are we with this patch? I'm a bit unclear from the discussion on
> whether it passes muster or not. Things seem to have petered out.

I took another look at this patch tonight. I think the thing that is
bothering everybody (including Pavel) is that as submitted, pl_check.c
involves a huge amount of duplication of knowledge and code from
pl_exec.c, and to a lesser extent pl_comp.c. It certainly looks like a
maintenance disaster in the making. It doesn't bother me so much that
pl_check.c knows about each syntactic structure in plpgsql: there are
already four or five places you have to touch when adding new syntax.
Rather, it's that there's so much duplication of knowledge about
semantic details, which is largely expressed by copied-and-pasted code
from pl_exec.c. It seems like a safe bet that we'll sometimes miss the
need to fix pl_check.c when we fix something in pl_exec.c. There are
annoying duplications from pl_comp.c as well, eg knowledge of exactly
which magic variables are defined in trigger functions.

Having said all that, it's not clear that we can really do any better.
The only obvious alternative is pushing support for a checking mode
directly into pl_exec.c, which would obfuscate and slow down that code
to an unacceptable degree if Pavel's results at
http://www.postgresql.org/message-id/CAFj8pRAKuJmVjPjzfSryE7+uB8jF8Wtz5rkxK-0ykXme-k81kA@mail.gmail.com
are any indication. (In that message, Pavel proposes shoveling the
problem into the compile code instead, but that seems unlikely to work
IMO: the main problem in pl_check.c as it stands is duplication of code
from pl_exec.c not pl_comp.c. So I think that path could only lead to
duplicating the same code into pl_comp.c.)

So question 1 is do we want to accept that this is the implementation
pathway we're going to settle for, or are we going to hold out for a
better one? I'd be the first in line to hold out if I had a clue how
to move towards a better implementation, but I have none. Are we
prepared to reject this type of feature entirely because of the
code-duplication problem? That doesn't seem attractive either.

But, even granting that this implementation approach is acceptable,
the patch does not seem close to being committable as-is: there's
a lot of fit-and-finish work yet to be done. To make my point I will
just quote from one of the regression test additions:

create or replace function f1()
returns void as $$
declare a1 int; a2 int;
begin
select 10,20 into a1;
end;
$$ language plpgsql;
-- raise warning
select plpgsql_check_function('f1()');
plpgsql_check_function
-------------------------------------------------------------------------
warning:00000:4:SQL statement:too many attributies for target variables
Detail: There are less target variables than output columns in query.
Hint: Check target variables in SELECT INTO statement
(3 rows)

Do we like this output format? I don't. The unlabeled, undocumented
fields crammed into a single line with colon separators are neither
readable nor useful. If we actually need these fields, why aren't we
splitting the output into multiple columns? (I'm also wondering why
the patch bothers with an option to emit this same info in XML. Surely
there is vanishingly small use-case for mechanical examination of this
output.)

This example also shows that the user-visible messages have had neither
editing from a native speaker of English, nor even attention from a
spell checker. I don't fault Pavel for that (his English is far better
than my Czech) but this patch has been passed by at least one reviewer
who is a native speaker. Personally I'd also say that neither the
Detail nor Hint messages in this example are worth the electrons they
take up, as they add nothing useful to the basic error message. I'd be
embarrassed to be presenting output like this to end users of Postgres.

(The code comments are in even worse shape than the user-visible
messages, by the by, where they exist at all.)

A somewhat related point is that it doesn't look like any thought
has been taken about localized message output.

This stuff is certainly all fixable if we agree on the basic
implementation approach; and I can't really fault Pavel for not
worrying much about such details while the implementation approach
hasn't been agreed to. But I'd judge that there's a week or more's
worth of work here to get to a patch I'd be happy about committing,
even without any change in the basic approach. That's not time I'm
willing to put in personally right now.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2013-03-26 04:29:48 Re: pg_upgrade segfaults when given an invalid PGSERVICE value
Previous Message Jeff Davis 2013-03-26 01:48:22 Re: Limiting setting of hint bits by read-only queries; vacuum_delay