Syntax error reporting (was Re: [PATCHES] syntax error position "CREATE FUNCTION" bug fix)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Syntax error reporting (was Re: [PATCHES] syntax error position "CREATE FUNCTION" bug fix)
Date: 2004-03-19 17:57:46
Message-ID: 25705.1079719066@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

[ moving thread to hackers ]

Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr> writes:
> However, I still stick with my "bad" simple idea because the simpler the
> better, and also because of the following example:
> ...
> psql> SELECT count_tup('pg_shadow');
> ERROR: syntax error at or near "FRM" at character 22
> CONTEXT: PL/pgSQL function "count_tup" line 4 at for over execute statement

> As you can notice, the extract is not in the submitted query, so there
> is no point to show it there.

Yeah. However, I dislike your solution because it confuses the cases of
a syntax error in the actually submitted query, and a syntax error in an
internally generated query. We should keep these cases clearly separate
because clients may want to do different things. For a syntax error in
the submitted input, what you probably want to do is edit and resubmit
the original query --- that's the case I was thinking about in saying
that a GUI client like pgadmin would want to set the editing cursor in
the original input window. But this action is nonsensical if the syntax
error is from a generated query. Perhaps the GUI client could be smart
enough to pop up a new window in which one could edit and resubmit the
erroneous function definition.

Even in psql's simplistic error handling, you want to distinguish the
two cases. There's no point in showing the entire original query; one
line worth of context is plenty. But you very probably do want to see
all of a generated query. So I don't want the backend sending back
error reports that look the same in both cases.

The original design concept for the 'P' (position) error field is that
it would be used to locate syntax errors in the *original query*, and
so its presence is a cue to the client code to go in the direction of
setting the editing cursor. (Note the protocol specification says
"index into the original query string".) We have in fact misimplemented
it, because it is being set for syntax errors in internally generated
queries too.

I was already planning to modify plpgsql to send back the full text of
generated queries when there is an error. My intention was to supply
this just as part of the CONTEXT stack, that is instead of your example
of

ERROR: syntax error at or near "FRM" at character 22
CONTEXT: PL/pgSQL function "count_tup" line 4 at for over execute statement

you'd get something like

ERROR: syntax error at or near "FRM" at character 22
CONTEXT: Executing command "SELECT COUNT(*) AS c FRM pg_shadow"
PL/pgSQL function "count_tup" line 4 at for over execute statement

However it might be better to invent a new error-message field that
carries just the text of the SQL command, rather than stuffing it into
CONTEXT. (This is similar to your original patch, but different in
detail because I'm envisioning sending back generated queries, never the
submitted query. Regurgitating the submitted query is just a waste of
bandwidth.) The plus side of that would be that it'd be easy to extract
for syntax-error highlighting. The minuses are that existing clients
would fail to print such a field (the protocol spec says to ignore
unknown fields), and that there is no good way to cope with nested
queries.

A possible compromise is to put the text of the generated SQL command
into a new field only if the error is a syntax error, and put it into
the CONTEXT stack otherwise. Syntax errors couldn't be nested so at
least that problem goes away. This seems a bit messy though.

The other thing to think about is whether we should invent a new field
to carry syntax error position for generated queries, rather than making
'P' do double duty as it does now. If we don't do that then we have to
change the protocol specification to reflect reality. In any case I
think it has to be possible to tell very easily from the error message
whether the 'P' position refers to the submitted query or a generated
query.

Comments anyone?

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Swan 2004-03-19 18:43:43 Re: COPY formatting
Previous Message Bruce Momjian 2004-03-19 16:53:34 Re: [HACKERS] UnixWare/CVS Tip/initdb.c needs to use

Browse pgsql-patches by date

  From Date Subject
Next Message Bruce Momjian 2004-03-19 18:46:19 Re: [HACKERS] UnixWare/CVS Tip/initdb.c needs to use
Previous Message Tom Lane 2004-03-19 16:56:19 Re: [HACKERS] UnixWare/CVS Tip/initdb.c needs to use