Re: [HACKERS] Refactoring identifier checks to consistently use strcmp

From: Daniel Gustafsson <daniel(at)yesql(dot)se>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Refactoring identifier checks to consistently use strcmp
Date: 2018-01-10 22:49:47
Message-ID: D5F34C9D-3AB7-4419-AF2E-12F67581D71D@yesql.se
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

> On 07 Jan 2018, at 01:17, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> Would be great to hear your thoughts, Daniel, so leaving this in Waiting on
> Author for now.

I am still of the opinion that it’s worth going through and ensuring that we
are matching against parser output in a consistent way, if only to lower the
risk of subtle hard to find bugs. This patch is a first stab, but there are
more string comparisons to consider should we decide to ahead with this patch
(or the general approach in some other fashion than this implementation).

Collating the responses so far with an updated patch, thanks to everyone who
has reviewed this patch! Sorry being slow to respond, $life hasn’t allowed for
much hacking lately.

> On 30 Nov 2017, at 20:14, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> I am having a bit of trouble understanding why the first hunk in
> DefineAggregate() is taking PARALLEL as a special case. The
> documentation gives three separate synopses for CREATE AGGREGATE, but
> parallel appears in all of them, and there are other options that do
> too, so the comment doesn't really help me understand why it's special
> as compared to other, similar options.

The reason for the special handling of parallel in the old-style CREATE
AGGREGATE syntax is that it’s parsed with IDENT rather than ColLabel. AFAICT
that works for all parameters except parallel as it’s an unreserved keyword
since 7aea8e4f2da. Extending old_aggr_elem to handle PARALLEL separately seems
to work for not requiring the quoting, but I may be missing something as the
parser hacking isn’t my speciality. The patch has been updated with this (and
the documentation + tests changes to go with it) but it clearly needs a close
eye.

> I think the changes in DefineView and ATExecSetRelOptions is wrong,
> because transformRelOptions() is still using pg_strcasecmp.

Correct, I had missed that reloptions case, sorry about that. The hunk in
AlterTableGetRelOptionsLockLevel() seems correct to me, and testing didn’t
break anything, but I wonder if there is a case where it would need
pg_strncasecmp?

> On 08 Jan 2018, at 17:27, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> nor has anyone taken the trouble to list with precision all of the
> places where the behavior will change. I think the latter is
> absolutely indispensable,

I had a look at the available commands in postgres and compiled a list of them
in options.sql based on if they have options, and how those options and matched
(case sensitive of insensitive). The queries in the file are nonsensical, they
are just meant to show the handling of the options. The idea was to illustrate
the impact of this proposal by running examples. Running this file with and
without the patches applied shows the following commands being affected:

CREATE TABLE
CREATE TABLE AS
ALTER TABLE
CREATE TABLESPACE
ALTER TABLESPACE
CREATE VIEW
ALTER VIEW
CREATE INDEX
ALTER INDEX
CREATE AGGREGATE (new and old syntax)
CREATE COLLATION
CREATE FUNCTION
CREATE OPERATOR
ALTER OPERATOR
CREATE TYPE
CREATE TEXT SEARCH CONFIGURATION
CREATE TEXT SEARCH DICTIONARY
ALTER TEXT SEARCH DICTIONARY

The output with the patch is attached as options_patched.out, and the output
from master as options_master.out. Diffing the two files is rather helpful I
think.

All possible options aren’t excercised, and I hope I didn’t miss any statements
that should’ve been covered. The options.sql file makes it quite obvious that
we currently have quite a mix of case sensitive and insensitive commands. Is
this in line with what you were thinking Robert? I’m definitely up for better
ideas.

One open question from this excercise is how to write a good test for this. It
can either be made part of the already existing test queries or a separate
suite. I’m leaning on the latter simply because the case-flipping existing
tests seems like something that can be cleaned up years from now accidentally
because it looks odd.

> On 07 Jan 2018, at 01:38, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> ISTM that if this patch gets rid of a large fraction of the uses of
> pg_strcasecmp in the backend, which is what I expect it should, then
> it might not be out of reach to go through all the surviving ones to
> make sure they are not processing strings that originate in the parser.

I completely agree, but I have not expanded this patch with this. One case was
actually instead put back, since I did see a mention in the documentation for
isStrict and isCachable they aren’t case sensitive. Instead I removed that
section from the documentation in a hope that we some day can make these case
sensitive.

Additionally, while poking at the commands I noticed an inconsistency in
checking for conflicting options in CREATE FUNCTION. The below statement is
correctly erroring on IMMUTABLE and VOLATILE being conflicting:

create function int42(cstring) returns int42 AS 'int4in'
language internal strict immutable volatile;

If you however combine the new and old syntax, the statement works and the WITH
option wins by overwriting any previous value. The below statement creates an
IMMUTABLE function:

create function int42(cstring) returns int42 AS 'int4in'
language internal strict volatile with (isstrict, iscachable);

It’s arguably a pretty silly statement to write, and may very well never have
been seen in production, but I would still expect that query to error out.
Attached volatility.patch fixes it in a hacky way to behave like the former
statement, there is probably a cleaner way but I didn’t want to spend too much
time on it before hearing if others think it’s not worth fixing.

Thanks for looking at this!

cheers ./daniel

Attachment Content-Type Size
defname_strcmp-v4.patch application/octet-stream 33.3 KB
volatility.patch application/octet-stream 5.6 KB
options_master.out application/octet-stream 34.3 KB
options_patched.out application/octet-stream 37.9 KB
options.sql application/octet-stream 24.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tatsuo Ishii 2018-01-10 22:58:32 Re: Fix a Oracle-compatible instr function in the documentation
Previous Message Peter Geoghegan 2018-01-10 22:48:40 Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)