Re: Cluster wide option to control symbol case folding

From: "Lewis, Ian \(Microstar Laboratories\)" <ilewis(at)mstarlabs(dot)com>
To: "Greg Stark" <stark(at)mit(dot)edu>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "PostgreSQL-development" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Cluster wide option to control symbol case folding
Date: 2017-01-03 00:33:47
Message-ID: ACF85C502E55A143AB9F4ECFE960660A227BA2@mailserver2.local.mstarlabs.com
Lists: pgsql-hackers

Greg Stark <gsstark(at)gmail(dot)com> wrote:

> But the problem with configurable quoting rules is a bit different.
> Imagine your application later decides to depend on PostGIS. So you
> load the PostGIS extension and perhaps also some useful functions you
> found on Stack Overflow for solving some GIS problem you have. Those
> extensions will create objects and then work with those objects and may
> use CamelCase for clarity -- in fact I think PostGIS functions are
> documented as CamelCase. The PostGIS extensions might not work on your
> system with different case rules if they haven't been 100% consistent
> with their camelCasing, and the functions from StackOverflow would be
> even less likely to work.

Well, in the case of StackOverflow suggestions, I cannot remember a time
when I did not have to rewrite whatever suggestions I have found and
used. That is not to say that StackOverflow is not useful. It is
incredibly useful at times. But the suggestions are usually fragments
showing how to do something, not solutions. And most such suggestions
are small, and so relatively easy to understand and patch as needed.
Many such suggestions are not very useful verbatim anyhow. They are
useful exactly because they allow you to understand something that you
were unable to glean from the documentation. Certainly, making symbol
usage consistent is not a hard patch on a small fragment of code that
probably needs help anyhow to bring it to production grade. I would not
consider this a strong argument against having modal symbol recognition.

Your point about PostGIS, and other full or partial solutions for a
complex problem, is a more serious issue. I do not have a strong answer
to this point. However, at the least, a CamelCase defect in a tool is a
pretty easy problem to locate and fix with a patch. (I understand
that your point is not just about PostGIS, but for PostGIS itself I have
read in a few places that they quote everything already. I do not know
whether that is true or not as I have never even looked at the tool.
However, if it is true they quote everything, then they already have
their CamelCase exactly right everywhere. If they did not, the symbol
lookup would fail against current PostgreSQL. Any tool that quotes
everything should work the same way against any mode as long as all
modes are case sensitive. It might be ugly, but at least it should
always work no matter what case translation the back end applies.)
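
To make the quoting argument concrete, here is a toy model in Python of how an identifier might be resolved under different folding modes. It is only a sketch, not PostgreSQL's parser: the mode names "lower", "upper", and "preserve" and the symbol RoadSegments are hypothetical, chosen purely for illustration.

```python
# Toy model of identifier resolution; NOT PostgreSQL's actual parser.
# The mode names and the symbol "RoadSegments" are hypothetical.

def fold(identifier: str, mode: str) -> str:
    """Return the name a catalog lookup would use for an identifier."""
    if len(identifier) >= 2 and identifier[0] == '"' and identifier[-1] == '"':
        # Quoted: strip the quotes and keep the case exactly, in every mode.
        return identifier[1:-1].replace('""', '"')
    if mode == "lower":
        return identifier.lower()
    if mode == "upper":
        return identifier.upper()
    if mode == "preserve":
        return identifier
    raise ValueError(f"unknown mode: {mode}")


def resolves(definition: str, reference: str, mode: str) -> bool:
    """True if a reference finds the object created by the definition."""
    return fold(definition, mode) == fold(reference, mode)


MODES = ("lower", "upper", "preserve")

# A tool that quotes everything, with consistent case, resolves in all modes.
quoted_consistent = {m: resolves('"RoadSegments"', '"RoadSegments"', m) for m in MODES}

# Quoted but inconsistent case fails in all modes, including today's folding.
quoted_inconsistent = {m: resolves('"RoadSegments"', '"Roadsegments"', m) for m in MODES}

# Unquoted and inconsistent: folding hides the defect, "preserve" exposes it.
unquoted_inconsistent = {m: resolves("RoadSegments", "Roadsegments", m) for m in MODES}

for name, result in [("quoted, consistent", quoted_consistent),
                     ("quoted, inconsistent", quoted_inconsistent),
                     ("unquoted, inconsistent", unquoted_inconsistent)]:
    print(name, result)
```

In this model only the unquoted, inconsistently cased reference behaves differently from one mode to the next, which is the class of defect a case sensitive mode would surface in third party code.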

In our own code, I actually would prefer that we were forced to always
use the same case everywhere we refer to a symbol. And a case sensitive
behavior would enforce that at testing. I do not want this because I
want to be able to define symbols that differ only in case. I want it so
that every symbol reference is exactly visually like every other symbol
reference to the same object. Even though the effect is small, I think
such consistency makes it easier to read code. Even in C we almost never
use the ability to overload on case alone except in a few rare - and
localized - cases where the code is actually clearer with such a
notation. For example, in a mathematical implementation using a
notation where something like t acts as an index and T defines the range
of t, the difference in case is very clear. Perhaps more importantly,
this use of overload on case is consistent with conventional
mathematical notation (which, in my opinion, is very good where it
belongs). This is not true when dealing with TheLongSymbolWithMixedCase
vs. TheLongSymbolWithMixedcase. The human brain cannot see that
difference easily, while it can see the difference between t and T very
easily, and it can see the relationship between the two symbols more
easily than it can see the relationship between t and tmax, say. Still,
we almost never have such code running on a database server.

Anyhow, you have a good point about third party libraries and tools that
integrate with PostgreSQL. However, I for one would be willing to live
with and address that kind of issue as needed. If the behavior were
controlled at database create time, which, from the articles Tom linked,
seems to be the general consensus as the right time for such a choice
given the current implementation, then one would at least have the
option of having databases with different case rules within a cluster.
Since each session can only connect to one database, this is not a
solution to every such situation, but it would address at least some
such cases.

Ian Lewis (www.mstarlabs.com)

PS. To anyone who might know the answer: My Reply All to this group does
not seem to join to the original thread. All I am doing is Reply All
from Outlook. Is there something else I need to do to allow my responses
to join the original thread?
