Re: Problems with Error Messages wrt Domains, Checks

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: john frazer <johnfrazer783(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Problems with Error Messages wrt Domains, Checks
Date: 2018-03-17 15:20:37
Message-ID: CAKFQuwbFLvUiVOV4N7a1Fdh6hQthuLZCy-WwcQ6w5Cp6C4MRyw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Mar 17, 2018 at 6:14 AM, john frazer <johnfrazer783(at)gmail(dot)com>
wrote:

> Today I realized a number of points where PostgreSQL (v10.3) is rather
> lackluster and sparse in its error messages.
>
> ​You may find the following thread and its predecessors enlightening.

https://www.postgresql.org/message-id/CAD3a31WR8X1TpjR_MoZxuz4S0BO3ZkPAeLoQ9rPxKHG%3D728eoQ%40mail.gmail.com

​Basically, the fundamental problem is type input is performed in a
relatively isolated fashion since there is no requirement that a table or
column of said type even exist.

> psql:db/experiments/pg-error-fail-illegal-regex.sql:17: ERROR:
> invalid regular expression: parentheses () not balanced
>
> There are several problems with this error message:
>
> FAILURE: the error is really in line 5 where a syntactically invalid RegEx
> is created; the fact that it is a RegEx and not a general string is obvious
> from the semantics of the ~ (tilde) operator at that point in time.
>
> ​Yeah, the fact that we don't "compile" expressions is unfortunate. Not
sure if there are any plans to do so or what limitations there are​

> FAILURE: the offending RegEx is not referred to and not quoted in the
> error message.
>
​This seems like an easy enough oversight to correct.​ In all the
discussion about being challenging to identify location I don't recall
seeing anything about why we aren't at least showing the offending input
value.

> As such, it could be anywhere in my many, many kLOCs big DB definition. I
> cannot even search the RegEx with a RegEx because all I know is some
> parenthesis is missing, somewhere:
>
​Well, the error does point to the first statement in the chain of issues -
working backward a couple of steps is possible.

> RegExes cannot match parentheses,
>
​Sure they can​.

> and PG RegExes do not have a unique syntactic marker to them.
>
​True​

> FAILURE: before the insert statement, everything runs dandy. We could
> have built an entire data warehouse application on top of a table
> definition that can never be syntactically processed but which will only
> fail when someone accidentally tries to insert a line.
>
​Since this is going to fail every single time you add a record I'm lacking
sympathy here. "Accidentally tries to insert a line" - if the table wasn't
meant to be used why does it exist in the first place?​ And if it is
intended to be used then functional testing should quickly point out
something like this.

> FAILURE: I can select from a table with a syntactically invalid definition.
>
​You're stretching here if you think this is an important failure point.
Since the table cannot not have valid data there would be nothing to
select. Checking constraints during selection is undesireable - they
should be an are only checked during insertion or when the constraint
itself changes.

With only line B active, this gives:
>
> psql:db/experiments/pg-error-fail-no-constraint-name.sql:16:
> ERROR: new row for relation "table_with_constraints" violatescheck constraint "field b must have 3 characters"
> DETAIL: Failing row contains (xxxx).
>
> SUCCESS: we get the name of the relation *and* the name of the violated
> rule.
>
> SUCCESS: the offending piece of data is quoted.
>
> FAILURE: we don't get the full name of the relation, which is
> "X"."table_with_constraints". Neither do we get the name of the column that
> received the offending value.
>
​No, you get "check constraint field b must have 3 characters" with the
owning table. You've defined a table constraint so there is no directly
attached column to report - the expression as a whole fails and we don't
report which boolean aspects of the expression where true and false. You
do get the input value which makes manually resolving the expression
possible. The lack of schema-qualification on the table identifier seems
like an oversight but at the moment I'd not willing to go find evidence in
support to opposition to that thought.​

Lastly, with only line A (not line B) active:
>
> psql:db/experiments/pg-error-fail-no-constraint-name.sql:16:
> ERROR: value for domain x.a_legal_regex violates check constraint "a_legal_regex_check"
>
> FAILURE: no reference to the affected table, column is made.
>
FAILURE: no reference to the offending piece of data is made
>
​Repeat of the first example, same explanations apply. Hopefully this gets
improved eventually.​

> FAILURE: no reference to the offended constraint is made ("column a must
> start with x").
>
​It never got that far into the validation process. It couldn't even form
a value of correct type that the constraint expression could evaluate. I
suppose this is just another aspect of the first problem - the isolation of
type conversion and the absence of keeping and reporting a stack-trace.
Someone more knowledgeable than I would need to expound on the similarities
and differences.

> What are the best practices or workarounds for the above shortcomings?
> I've been trying for several hours to figure out what causes an error
> message a la value for domain xxx violates check constraint "xxx_check"
> by rewriting table definitions, inserting data row by row and so on, to no
> avail. What I need is a full chain of the objects (column -> table ->
> constraint -> domain -> check) that are involved in the error.
>
​If the error starts with "value for domain xxx violates" then the only
place to look is at your "create domain" statement for "xxx". Well, at
least once you know what the actually offending value is...which right now
might be a bit of a challenge depending on the situation (if your first
example applies the data doesn't matter and critically evaluating the
create domain statement might be sufficient).

> I'm writing this to the developers' list because I see the above
> observations as serious shortcomings in an otherwise great piece of
> software that can probably not be fixed by using client-side code only.
>

​I suppose the best practice when dealing with a lack of information in the
error handle code flows is to limit the amount of context that is in scope
by unit testing. And while they are absolutely short-comings overcoming
them has cost in terms of both developer effort and, more importantly,
runtime performance.

David J.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christos Maris 2018-03-17 16:12:42 Re: Google Summer of Code: Potential Applicant
Previous Message Tom Lane 2018-03-17 14:56:40 Re: strange failure in plpgsql_control tests (on fulmar, ICC 14.0.3)