Re: Doing better at HINTing an appropriate column within errorMissingColumn()

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Abhijit Menon-Sen <ams(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com>, Ian Barwick <ian(at)2ndquadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, Jim Nasby <jim(at)nasby(dot)net>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Subject: Re: Doing better at HINTing an appropriate column within errorMissingColumn()
Date: 2014-12-23 00:34:27
Message-ID: CAM3SWZTXLPP5S73BbALv8zbVfce_mnEFV1hRWm2hBAXb6geMLg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 22, 2014 at 5:50 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Looking over the latest patch, I think we could simplify the code so
> that you don't need multiple FuzzyAttrMatchState objects. Instead of
> creating a separate one for each RTE and then merging them, just have
> one. When you find an inexact-RTE name match, set a field inside the
> FuzzyAttrMatchState -- maybe with a name like rte_penalty -- to the
> Levenshtein distance between the RTEs. Then call scanRTEForColumn()
> and pass down the same state object. Now let
> updateFuzzyAttrMatchState() work out what it needs to do based on the
> observed inter-column distance and the currently-in-force RTE penalty.

I'm afraid I don't follow. I think doing things that way makes things
less clear. Merging is useful because it allows us to consider that an
exact match might exist, which this searchRangeTableForCol() is
already tasked with today. We now look for the best match
exhaustively, or magically return immediately in the event of an exact
match, without caring about the alias correctness or distance.

Having a separate object makes this pattern apparent from the top
level, within searchRangeTableForCol(). I feel that's better.
updateFuzzyAttrMatchState() is the wrong place to put that, because
that task rightfully belongs in searchRangeTableForCol(), where the
high level diagnostic-report-generating control flow lives.

To put it another way, creating a separate object obfuscates
scanRTEForColumn(), since it's the only client of
updateFuzzyAttrMatchState(). scanRTEForColumn() is a very important
function, and right now I am only making it slightly less clear by
tasking it with caring about distance of names on top of strict binary
equality of attribute names. I don't want to push it any further.
--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2014-12-23 00:43:06 Re: Doing better at HINTing an appropriate column within errorMissingColumn()
Previous Message Andrew Gierth 2014-12-22 23:57:43 Re: Final Patch for GROUPING SETS