|From:||Chapman Flack <chap(at)anastigmatix(dot)net>|
|To:||Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>|
|Cc:||Erik Rijkers <er(at)xs4all(dot)nl>, pgsql-hackers(at)lists(dot)postgresql(dot)org|
|Subject:||Re: Define jsonpath functions as stable|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
On 09/18/19 17:12, Tom Lane wrote:
> After further reading, it seems like what that text is talking about
> is not actually a regex feature, but an outgrowth of the fact that
> the regex pattern is being expressed as a string literal in a language
> for which XML character entities are a native aspect of the string
> literal syntax. So it looks to me like the entities get folded to
> raw characters in a string-literal parser before the regex engine
> ever sees them.
Hmm. That occurred to me too, but I thought the explicit mention of
'character reference' in the section specific to regexes might not
mean that. It certainly could have been clearer.
But you seem to have the practical agreement of both BaseX:
let $foo := codepoints-to-string((38,35,120,54,49,59))
return ($foo, matches('a', $foo))
and the Saxon-based pljava example:
select occurrences_regex('a', 'a', w3cNewlines => true);
> As such, I think this doesn't apply to SQL/JSON. The SQL/JSON spec
> in either of those languages you have backslash escape sequences
> for writing weird characters, *not* XML entities. You certainly
> wouldn't have use of such entities in a native implementation of
> LIKE_REGEX in SQL.
So yeah, that seems to be correct.
The upshot seems to be a two-parter:
1. Whatever string literal syntax is used in front of the regex engine
had better have some way to represent any character you could want
to match, and
2. There is only one way to literally match a character that is a regex
metacharacter, namely, to precede it with a backslash (that the regex
engine will see; therefore doubled if necessary). Whatever codepoint
escape form might be available in the string literal syntax does not
offer another way to do that, because it happens too early, before
the regex engine can see it.
> So now I'm thinking we can just remove the handwaving about entities.
> On the other hand, this points up a large gap in our docs about
> SQL/JSON, which is that nowhere does it even address the question of
> what the string literal syntax is within a path expression.
That does seem like it ought to be covered.
|Next Message||Thomas Munro||2019-09-18 22:51:09||Re: [PATCH] src/test/modules/dummy_index -- way to test reloptions from inside of access method|
|Previous Message||Alvaro Herrera||2019-09-18 22:18:13||Re: Optimization of some jsonb functions|