Re: Doing better at HINTing an appropriate column within errorMissingColumn()

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Ian Barwick <ian(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Jim Nasby <jim(at)nasby(dot)net>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Subject: Re: Doing better at HINTing an appropriate column within errorMissingColumn()
Date: 2014-07-17 13:34:24
Message-ID: CAB7nPqTJ4VCjK1mH2pQ_6OkdyXU=ZjZtEuB9JC5wnVSqpqDtAA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 9, 2014 at 3:56 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:

> What would that reshuffling actually look like?
>
Something like the patch 1 attached...

Btw, re-reading this thread, everybody seem to agree that this is a useful
feature, but we still do not have clear definitions of the circumstances
under which column hints should be produced, except the number (up to two).
So, putting my hands on it and biting the bullet, I have finished with the
two patches attached making the implementation clearer:
- Patch 1 moves levenshtein functions from fuzzystrmatch to core.
- Patch 2 implements the column hints, rather unchanged from original
proposition.

Patch 1 does a couple of things:
- fuzzystrmatch is dumped to 1.1, as Levenshtein functions are not part of
it anymore, and moved to core.
- Removal of the LESS_EQUAL flag that made the original submission patch
harder to understand. All the Levenshtein functions wrap a single common
function.
- Documentation is moved, and regression tests for Levenshtein functions
are added.
- Functions with costs are renamed with a suffix with costs.
After hacking this feature, I came up with the conclusion that it would be
better for the user experience to move directly into backend code all the
Levenshtein functions, instead of only moving in the common wrapper as
Peter did in his original patches. This is done this way to avoid keeping
portions of the same feature in two different places of the code (backend
with common routine, fuzzystrmatch with levenshtein functions) and
concentrate all the logic in a single place. Now, we may as well consider
renaming the levenshtein functions into smarter names, like str_distance,
and keep fuzzystrmatch to 1.0, having the functions levenshteing_* calling
only the str_distance functions.

Having a set of in-core distance functions for strings would serve more
general purposes like other object hinting (constraint names, tables, etc.).

Patch 2 is a rebase of the feature of Peter that can be applied on top of
patch 1. The code is rather untouched (haven't much played with Peter's
thingies), well-commented, but I think that this needs more work,
particularly when a query has a single RTE like in this case where no hints
are proposed to the user (mentioned upthread):
create table foo (aa int, bb int);
select ab from foo; -- no hints

Before doing anything more with patch 2, we still need to define clearly
how hints should be produced, so that's clearly out-of-scope for this CF.
Patch 1, though, prepares the field for hints of all kinds, so perhaps we
could argue more on that first?

Regards,
--
Michael

Attachment Content-Type Size
0001-Move-Levenshtein-functions-to-core.patch text/x-diff 55.5 KB
0002-Support-for-column-hints.patch text/x-diff 26.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2014-07-17 14:26:07 Re: RLS Design
Previous Message Thor Lancelot Simon 2014-07-17 13:27:33 Re: PostgreSQL for VAX on NetBSD/OpenBSD