Re: Ltree syntax improvement

From: Nikolay Shaplov <dhyan(at)nataraj(dot)su>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: Dmitry Belyavsky <beldmit(at)gmail(dot)com>
Subject: Re: Ltree syntax improvement
Date: 2019-02-05 13:26:17
Message-ID: 30381329.7fST72pxSh@x200m
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

В письме от вторник, 29 января 2019 г. 20:43:07 MSK пользователь Dmitry
Belyavsky написал:

> Please find attached the patch extending the sets of symbols allowed in
> ltree labels. The patch introduces 2 variants of escaping symbols, via
> backslashing separate symbols and via quoting the labels in whole for
> ltree, lquery and ltxtquery datatypes.

Hi!

Let's me join the review process...

I've just give a superficial look to the patch, read it through, and try here
and there, so first I'll give feedback about exterior features of the patch.

So, the first question: why do we need all this? The answer seems to be
obvious, but it would be good say it explicitly. What problems would it solve?
Do these problems really exists?

The second question is about coding style. As far as I can see you've passed
your code through pg_indent, but it does not solve all problems.

As far as I know, postgres commiters are quite strict about code width, the
code should not be wider that 80 col, except places were string constants are
introduced (There you can legally exceed 80 col). So I would advice to use vim
with tabstop=4 and colorcolumn=80 and make sure that new code text does not
cross red vertical line.

Comments. There is no fixed coding style for comments in postgres. Usually this
means that it is better to follow coding in the code around. But ltree does
not have much comments, so I would suggest to use the best style I've seen in
the code
/*
* [function name]
* < tab ><tab> [Short description, what does it do]
*
* [long explanation how it works, several subparagraph if needed
*/
(Seen this in src/include/utils/rel.h)
or something like that.
At least it would be good to have explicit separation between descriptions and
explanation of how it works.

And about comment
/*
* Function calculating bytes to escape
* to_escape is an array of "special" 1-byte symbols
* Behvaiour:
* If there is no "special" symbols, return 0
* If there are any special symbol, we need initial and final quote, so return
2
* If there are any quotes, we need to escape all of them and also initial and
final quote, so
* return 2 + number of quotes
*/

It has some formatting. But it should not. As far as I understand this
comment should be treated as single paragraph by reformatter, and refomatted.
I do not understand why pg_indent have not done it. Documentation (https://
www.postgresql.org/docs/11/source-format.html) claims that if you want to
avoid reformatting, you should add "-------------" at the beginning and at the
end of the comment. So it would be good either remove this formatting, or add
"----" if you are sure you want to keep it.

Third part is about error messages.
First I've seen errors like elog(ERROR, "internal error during splitting
levels");. I've never seen error like that in postgres. May be if this error
is in the part of the code, that should never be reached in real live, may be
it would be good to change it to Assert? Or if this code can be normally
reached, add some more explanation of what had happened...

About other error messages.

There are messages like
errmsg("syntax error"),
errdetail("Unexpected end of line.")));

or

errmsg("escaping syntax error")));

In postgres there is error message style guide
https://www.postgresql.org/docs/11/error-style-guide.html

Here I would expect messages like that

Primary: Error parsing ltree path
Detail: Curly quote is opened and not closed

Or something like that, I am not expert in English, but this way it would be
better for sure. Key points are: Primary message should point to general area
where error occurred (there can be a large SQL with a lot of things in it, so
it is good to point that problem is with ltree staff). And is is better to
word from the point of view of a user. What for you (and me) is unexpected end
of line, for him it is unclosed curly quote.

Also I am not sure, if it is easy, but if it is possible, it would be good to
move "^" symbol that points to the place of the error, to the place inside ' '
where unclosed curly quote is located. As a user I would appreciate that.

So that's what I've seen while giving it superficial look...

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-02-05 14:25:28 Re: Memory contexts reset for trigger invocations
Previous Message Kyotaro HORIGUCHI 2019-02-05 12:30:24 Re: What happens if checkpoint haven't completed until the next checkpoint interval or max_wal_size?