Re: Doing better at HINTing an appropriate column within errorMissingColumn()

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Doing better at HINTing an appropriate column within errorMissingColumn()
Date: 2014-03-28 08:00:29
Message-ID: CAFj8pRCQChC8dOW4-Ry=csAoCcjGHFHkoAJYfcKSe3aLyiDfyQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

I see only one risk - it can do some slowdown of exception processing.

Sometimes you can have a code like

BEGIN
WHILE ..
LOOP
BEGIN
INSERT INTO ...
EXCEPTION WHEN ..
; /* ignore this error */
END;
END LOOP;

without this risks, proposed feature is nice, but should be fast

Regards

Pavel

2014-03-27 20:10 GMT+01:00 Peter Geoghegan <pg(at)heroku(dot)com>:

> With the addition of LATERAL subqueries, Tom fixed up the mechanism
> for keeping track of which relations are visible for column references
> while the FROM clause is being scanned. That allowed
> errorMissingColumn() to give a more useful error to the one produced
> by the prior coding of that mechanism, with an errhint sometimes
> proffering: 'There is a column named "foo" in table "bar", but it
> cannot be referenced from this part of the query'.
>
> I wondered how much further this could be taken. Attached patch
> modifies contrib/fuzzystrmatch, moving its Levenshtein distance code
> into core without actually moving the relevant SQL functions too. That
> change allowed me to modify errorMissingColumn() to make more useful
> suggestions as to what might have been intended under other
> circumstances, like when someone fat-fingers a column name. psql tab
> completion is good, but not so good that this doesn't happen all the
> time. It's good practice to consistently name columns and tables such
> that it's possible to intuit the names of columns from the names of
> tables and so on, but it's still pretty common to forget if a column
> name from the table "orders" is "order_id", "orderid", or "ordersid",
> particularly if you're someone who regularly interacts with many
> databases. This problem is annoying in a low intensity kind of way.
>
> Consider the following sample sessions of mine, made with the
> dellstore2 sample database:
>
> [local]/postgres=# select * from orders o join orderlines ol on
> o.orderid = ol.orderids limit 1;
> ERROR: 42703: column ol.orderids does not exist
> LINE 1: ...* from orders o join orderlines ol on o.orderid = ol.orderid...
> ^
> HINT: Perhaps you meant to reference the column "ol"."orderid".
> LOCATION: errorMissingColumn, parse_relation.c:2989
> [local]/postgres=# select * from orders o join orderlines ol on
> o.orderid = ol.orderid limit 1;
> orderid | orderdate | customerid | netamount | tax | totalamount |
> orderlineid | orderid | prod_id | quantity | orderdate
>
> ---------+------------+------------+-----------+-------+-------------+-------------+---------+---------+----------+------------
> 1 | 2004-01-27 | 7888 | 313.24 | 25.84 | 339.08 |
> 1 | 1 | 9117 | 1 | 2004-01-27
> (1 row)
>
> [local]/postgres=# select ordersid from orders o join orderlines ol on
> o.orderid = ol.orderid limit 1;
> ERROR: 42703: column "ordersid" does not exist
> LINE 1: select ordersid from orders o join orderlines ol on o.orderi...
> ^
> HINT: Perhaps you meant to reference the column "o"."orderid".
> LOCATION: errorMissingColumn, parse_relation.c:2999
> [local]/postgres=# select ol.ordersid from orders o join orderlines ol
> on o.orderid = ol.orderid limit 1;
> ERROR: 42703: column ol.ordersid does not exist
> LINE 1: select ol.ordersid from orders o join orderlines ol on o.ord...
> ^
> HINT: Perhaps you meant to reference the column "ol"."orderid".
> LOCATION: errorMissingColumn, parse_relation.c:2989
>
> We try to give the most useful possible HINT here, charging extra for
> a non-matching alias, and going through the range table in order and
> preferring the first column observed to any subsequent column whose
> name is of the same distance as an earlier Var. The fuzzy string
> matching works well enough that it seems possible in practice to
> successfully have the parser make the right suggestion, even when the
> user's original guess was fairly far off. I've found it works best to
> charge half as much for a character deletion, so that's what is
> charged.
>
> I have some outstanding concerns about the proposed patch:
>
> * It may be the case that dense logosyllabic or morphographic writing
> systems, for example Kanji might consistently present, say, Japanese
> users with a suggestion that just isn't very useful, to the point of
> being annoying. Perhaps some Japanese hackers can comment on the
> actual risks here.
>
> * Perhaps I should have moved the Levenshtein distance functions into
> core and be done with it. I thought that given the present restriction
> that the implementation imposes on source and target string lengths,
> it would be best to leave the user-facing SQL functions in contrib.
> That restriction is not relevant to the internal use of Levenshtein
> distance added here, though.
>
> Thoughts?
> --
> Peter Geoghegan
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2014-03-28 08:22:09 Re: Doing better at HINTing an appropriate column within errorMissingColumn()
Previous Message Dean Rasheed 2014-03-28 07:58:51 Re: [PATCH] Negative Transition Aggregate Functions (WIP)