Re: Refactoring identifier checks to consistently use strcmp

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Refactoring identifier checks to consistently use strcmp
Date: 2017-09-05 08:34:43
Message-ID: 5AC0CFF5-062D-4421-BAEE-3EE703F62750@yesql.se
Lists: pgsql-hackers

> On 17 Aug 2017, at 11:08, Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:
>
>> On 16 Aug 2017, at 17:51, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
>>> This no longer works:
>>
>>> postgres=# CREATE TEXT SEARCH DICTIONARY public.simple_dict (
>>> TEMPLATE = pg_catalog.simple,
>>> "STOPWORds" = english
>>> );
>>> ERROR: unrecognized simple dictionary parameter: "STOPWORds"
>>
>>> In hindsight, perhaps we should always have been more strict about that
>>> to begin with, but let's not break backwards-compatibility without a
>>> better reason. I didn't thoroughly check all of the cases here, to see
>>> if there are more like this.
>>
>> You have a point, but I'm not sure that this is such a bad compatibility
>> break as to be a reason not to change things to be more consistent.
>
> I agree with this, but I admittedly have no idea how common the above case
> would be in the wild.
>
>>> It'd be nice to have some kind of a rule on when pg_strcasecmp should be
>>> used and when strcmp() is enough. Currently, by looking at the code, I
>>> can't tell.
>>
>> My thought is that if we are looking at words that have been through the
>> parser, then it should *always* be plain strcmp; we should expect that
>> the parser already did the appropriate case-folding.
>
> +1
>
>> pg_strcasecmp would be appropriate, perhaps, if we're dealing with stuff
>> that somehow came in without going through the parser.
>
> In that case it would be up to the consumer of the data to handle required
> case-folding for the expected input, so pg_strcasecmp or strcmp depending on
> situation.

This patch has been marked “Waiting on Author”, but I’m not sure what
consensus this thread reached with regard to quoted keywords and backwards
compatibility. There seems to be a 2-1 vote for allowing a break, and forcing
all keywords out of the parser to be case-folded. Any other opinions?
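
To make the trade-off concrete, here is a rough standalone sketch of the
behaviour being voted on. It is not code from the patch: fold_identifier()
and ascii_strcasecmp() are hypothetical, ASCII-only stand-ins for the
parser's folding of unquoted identifiers and for pg_strcasecmp, and
"stopwords" is simply the parameter from Heikki's example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for what the parser does to an identifier:
 * unquoted names are lower-cased, quoted names are kept verbatim.
 * ASCII-only simplification; the output is truncated to fit dst.
 */
void
fold_identifier(const char *src, bool quoted, char *dst, size_t dstlen)
{
    size_t      i;

    if (dstlen == 0)
        return;
    for (i = 0; src[i] != '\0' && i + 1 < dstlen; i++)
        dst[i] = quoted ? src[i] : (char) tolower((unsigned char) src[i]);
    dst[i] = '\0';
}

/* Hypothetical ASCII-only case-insensitive compare (stand-in for pg_strcasecmp). */
int
ascii_strcasecmp(const char *a, const char *b)
{
    for (;;)
    {
        int         ca = tolower((unsigned char) *a);
        int         cb = tolower((unsigned char) *b);

        if (ca != cb)
            return ca - cb;
        if (ca == 0)
            return 0;
        a++;
        b++;
    }
}

/* Proposed rule: the name came through the parser, so plain strcmp. */
bool
matches_strict(const char *name)
{
    return strcmp(name, "stopwords") == 0;
}

/* Old behaviour: accept any case, even for a quoted name. */
bool
matches_lenient(const char *name)
{
    return ascii_strcasecmp(name, "stopwords") == 0;
}

/* Usage: the unquoted and quoted spellings from the example upthread. */
void
check_examples(void)
{
    char        buf[64];

    fold_identifier("STOPWORds", false, buf, sizeof(buf));
    assert(matches_strict(buf) && matches_lenient(buf));

    fold_identifier("STOPWORds", true, buf, sizeof(buf));
    assert(!matches_strict(buf) && matches_lenient(buf));
}
```

Under these assumptions, only the quoted mixed-case spelling changes
behaviour: it keeps its case through the parser, so strcmp rejects it where
the case-insensitive comparison used to accept it.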

cheers ./daniel
