Re: Ltree syntax improvement

From: Filip Rembiałkowski <plk(dot)zuber(at)gmail(dot)com>
To: Chris Travers <chris(dot)travers(at)adjust(dot)com>
Cc: Nikolay Shaplov <dhyan(at)nataraj(dot)su>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dmitry Belyavsky <beldmit(at)gmail(dot)com>
Subject: Re: Ltree syntax improvement
Date: 2019-03-07 12:55:20
Message-ID: CAP_rww=E1_n1KYCfGRob7n9-1+iBWMPjCHHzBL+8Uf2rPZ9jSg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Good day.

Sorry to pop in, but if you are active users of ltree, please let me
know if you rely on the exclamation operator (negative matching)?
I have just filed a patch, fixing very old bug - and it changes the
logic substantially.
https://commitfest.postgresql.org/23/2054/ - I'd be grateful for your
comments (please use the other thread).

On Thu, Mar 7, 2019 at 1:10 PM Chris Travers <chris(dot)travers(at)adjust(dot)com> wrote:
>
>
>
> On Tue, Feb 5, 2019 at 2:27 PM Nikolay Shaplov <dhyan(at)nataraj(dot)su> wrote:
>>
>> В письме от вторник, 29 января 2019 г. 20:43:07 MSK пользователь Dmitry
>> Belyavsky написал:
>>
>> > Please find attached the patch extending the sets of symbols allowed in
>> > ltree labels. The patch introduces 2 variants of escaping symbols, via
>> > backslashing separate symbols and via quoting the labels in whole for
>> > ltree, lquery and ltxtquery datatypes.
>>
>> Hi!
>>
>> Let's me join the review process...
>>
>> I've just give a superficial look to the patch, read it through, and try here
>> and there, so first I'll give feedback about exterior features of the patch.
>>
>> So, the first question: why do we need all this? The answer seems to be
>> obvious, but it would be good say it explicitly. What problems would it solve?
>> Do these problems really exists?
>
>
> Yes, it does. We maintain an extension (https://github.com/adjust/wltree) which has a fixed separator (::) and allows any utf-8 character in the tree.
>
> In our case we currently use our extended tree to store user-defined hierarchies of labels, which might contain whitespace, Chinese, Japanese, or Korean characters, etc.
>
> I would love to be able to deprecate our work on this extension and eventually use stock ltree.
>>
>>
>> The second question is about coding style. As far as I can see you've passed
>> your code through pg_indent, but it does not solve all problems.
>>
>> As far as I know, postgres commiters are quite strict about code width, the
>> code should not be wider that 80 col, except places were string constants are
>> introduced (There you can legally exceed 80 col). So I would advice to use vim
>> with tabstop=4 and colorcolumn=80 and make sure that new code text does not
>> cross red vertical line.
>>
>> Comments. There is no fixed coding style for comments in postgres. Usually this
>> means that it is better to follow coding in the code around. But ltree does
>> not have much comments, so I would suggest to use the best style I've seen in
>> the code
>> /*
>> * [function name]
>> * < tab ><tab> [Short description, what does it do]
>> *
>> * [long explanation how it works, several subparagraph if needed
>> */
>> (Seen this in src/include/utils/rel.h)
>> or something like that.
>> At least it would be good to have explicit separation between descriptions and
>> explanation of how it works.
>>
>> And about comment
>> /*
>> * Function calculating bytes to escape
>> * to_escape is an array of "special" 1-byte symbols
>> * Behvaiour:
>> * If there is no "special" symbols, return 0
>> * If there are any special symbol, we need initial and final quote, so return
>> 2
>> * If there are any quotes, we need to escape all of them and also initial and
>> final quote, so
>> * return 2 + number of quotes
>> */
>>
>> It has some formatting. But it should not. As far as I understand this
>> comment should be treated as single paragraph by reformatter, and refomatted.
>> I do not understand why pg_indent have not done it. Documentation (https://
>> www.postgresql.org/docs/11/source-format.html) claims that if you want to
>> avoid reformatting, you should add "-------------" at the beginning and at the
>> end of the comment. So it would be good either remove this formatting, or add
>> "----" if you are sure you want to keep it.
>>
>> Third part is about error messages.
>> First I've seen errors like elog(ERROR, "internal error during splitting
>> levels");. I've never seen error like that in postgres. May be if this error
>> is in the part of the code, that should never be reached in real live, may be
>> it would be good to change it to Assert? Or if this code can be normally
>> reached, add some more explanation of what had happened...
>>
>> About other error messages.
>>
>> There are messages like
>> errmsg("syntax error"),
>> errdetail("Unexpected end of line.")));
>>
>> or
>>
>> errmsg("escaping syntax error")));
>>
>>
>> In postgres there is error message style guide
>> https://www.postgresql.org/docs/11/error-style-guide.html
>>
>> Here I would expect messages like that
>>
>> Primary: Error parsing ltree path
>> Detail: Curly quote is opened and not closed
>>
>> Or something like that, I am not expert in English, but this way it would be
>> better for sure. Key points are: Primary message should point to general area
>> where error occurred (there can be a large SQL with a lot of things in it, so
>> it is good to point that problem is with ltree staff). And is is better to
>> word from the point of view of a user. What for you (and me) is unexpected end
>> of line, for him it is unclosed curly quote.
>>
>> Also I am not sure, if it is easy, but if it is possible, it would be good to
>> move "^" symbol that points to the place of the error, to the place inside ' '
>> where unclosed curly quote is located. As a user I would appreciate that.
>>
>> So that's what I've seen while giving it superficial look...
>>
>>
>
>
> --
> Best Regards,
> Chris Travers
> Head of Database
>
> Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
> Saarbrücker Straße 37a, 10405 Berlin
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2019-03-07 13:03:19 Re: pgsql: Removed unused variable, openLogOff.
Previous Message Alvaro Herrera 2019-03-07 12:46:54 Re: pg_dump multi VALUES INSERT