Re: Ltree syntax improvement

From: Chris Travers <chris(dot)travers(at)adjust(dot)com>
To: Nikolay Shaplov <dhyan(at)nataraj(dot)su>
Cc: Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dmitry Belyavsky <beldmit(at)gmail(dot)com>
Subject: Re: Ltree syntax improvement
Date: 2019-03-07 12:09:49
Message-ID: CAN-RpxAvzVSxqsqXhksM8YRNyiO8Nrtw+eTdJ9_p3CBXBU3F5w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 5, 2019 at 2:27 PM Nikolay Shaplov <dhyan(at)nataraj(dot)su> wrote:

> В письме от вторник, 29 января 2019 г. 20:43:07 MSK пользователь Dmitry
> Belyavsky написал:
>
> > Please find attached the patch extending the sets of symbols allowed in
> > ltree labels. The patch introduces 2 variants of escaping symbols, via
> > backslashing separate symbols and via quoting the labels in whole for
> > ltree, lquery and ltxtquery datatypes.
>
> Hi!
>
> Let's me join the review process...
>
> I've just give a superficial look to the patch, read it through, and try
> here
> and there, so first I'll give feedback about exterior features of the
> patch.
>
> So, the first question: why do we need all this? The answer seems to be
> obvious, but it would be good say it explicitly. What problems would it
> solve?
> Do these problems really exists?
>

Yes, it does. We maintain an extension (https://github.com/adjust/wltree)
which has a fixed separator (::) and allows any utf-8 character in the tree.

In our case we currently use our extended tree to store user-defined
hierarchies of labels, which might contain whitespace, Chinese, Japanese,
or Korean characters, etc.

I would love to be able to deprecate our work on this extension and
eventually use stock ltree.

>
> The second question is about coding style. As far as I can see you've
> passed
> your code through pg_indent, but it does not solve all problems.
>
> As far as I know, postgres commiters are quite strict about code width,
> the
> code should not be wider that 80 col, except places were string constants
> are
> introduced (There you can legally exceed 80 col). So I would advice to use
> vim
> with tabstop=4 and colorcolumn=80 and make sure that new code text does
> not
> cross red vertical line.
>
> Comments. There is no fixed coding style for comments in postgres. Usually
> this
> means that it is better to follow coding in the code around. But ltree
> does
> not have much comments, so I would suggest to use the best style I've seen
> in
> the code
> /*
> * [function name]
> * < tab ><tab> [Short description, what does it do]
> *
> * [long explanation how it works, several subparagraph if needed
> */
> (Seen this in src/include/utils/rel.h)
> or something like that.
> At least it would be good to have explicit separation between descriptions
> and
> explanation of how it works.
>
> And about comment
> /*
> * Function calculating bytes to escape
> * to_escape is an array of "special" 1-byte symbols
> * Behvaiour:
> * If there is no "special" symbols, return 0
> * If there are any special symbol, we need initial and final quote, so
> return
> 2
> * If there are any quotes, we need to escape all of them and also initial
> and
> final quote, so
> * return 2 + number of quotes
> */
>
> It has some formatting. But it should not. As far as I understand this
> comment should be treated as single paragraph by reformatter, and
> refomatted.
> I do not understand why pg_indent have not done it. Documentation (https://
> www.postgresql.org/docs/11/source-format.html) claims that if you want to
> avoid reformatting, you should add "-------------" at the beginning and at
> the
> end of the comment. So it would be good either remove this formatting, or
> add
> "----" if you are sure you want to keep it.
>
> Third part is about error messages.
> First I've seen errors like elog(ERROR, "internal error during splitting
> levels");. I've never seen error like that in postgres. May be if this
> error
> is in the part of the code, that should never be reached in real live, may
> be
> it would be good to change it to Assert? Or if this code can be normally
> reached, add some more explanation of what had happened...
>
> About other error messages.
>
> There are messages like
> errmsg("syntax error"),
> errdetail("Unexpected end of line.")));
>
> or
>
> errmsg("escaping syntax error")));
>
>
> In postgres there is error message style guide
> https://www.postgresql.org/docs/11/error-style-guide.html
>
> Here I would expect messages like that
>
> Primary: Error parsing ltree path
> Detail: Curly quote is opened and not closed
>
> Or something like that, I am not expert in English, but this way it would
> be
> better for sure. Key points are: Primary message should point to general
> area
> where error occurred (there can be a large SQL with a lot of things in it,
> so
> it is good to point that problem is with ltree staff). And is is better to
> word from the point of view of a user. What for you (and me) is unexpected
> end
> of line, for him it is unclosed curly quote.
>
> Also I am not sure, if it is easy, but if it is possible, it would be good
> to
> move "^" symbol that points to the place of the error, to the place inside
> ' '
> where unclosed curly quote is located. As a user I would appreciate that.
>
> So that's what I've seen while giving it superficial look...
>
>
>

--
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Steele 2019-03-07 12:23:53 Re: Optimization of some jsonb functions
Previous Message Tomas Vondra 2019-03-07 11:53:30 Re: Online verification of checksums