Re: New string functions; initdb required

From: "Ken Hirsch" <kenhirsch(at)myself(dot)com>
To: "hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: New string functions; initdb required
Date: 2002-06-12 23:02:32
Message-ID: 00f001c21266$07770540$0000a398@DELLXP1
Lists: pgsql-hackers

Thomas Lockhart wrote:
> Right. I'm not certain about the regex syntax defined by SQL99; I used
> the syntax that we already have enabled and it looks like we have a
> couple of other variants available if we need them. If someone wants to
> research the *actual* syntax specified by SQL99 that would be good...

As usual: ( ) + * [ ] |
Instead of dot . there is underscore _
There is % to mean .* just like LIKE
There is no ? or $, and ^ appears only inside [ ] to negate a set

Regular expressions match the whole string, as if there were an
implicit ^ before and $ after the pattern. You have to add % if
you want to match anywhere in a string.

As far as I can tell, there is no default escape character like \
but you can specify one.
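
A rough sketch of how the differences above could be mapped onto a conventional regex engine, here Python's re module. The names similar_to_regex and similar are hypothetical and this is not the PostgreSQL implementation. It assumes a well-formed pattern, copies bracket sets through mostly unchanged, and does not handle named sets such as [:ALPHA:].

```python
import re

def similar_to_regex(pattern, escape=None):
    """Translate a SIMILAR TO pattern into a Python regex string (sketch only)."""
    out = []
    in_bracket = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if escape is not None and ch == escape:
            # The character after the escape character stands for itself.
            i += 1
            if i >= len(pattern):
                raise ValueError("pattern ends with the escape character")
            out.append(re.escape(pattern[i]))
        elif in_bracket:
            if ch == "]":
                in_bracket = False
                out.append(ch)
            elif ch == "\\":
                # Backslash is ordinary in SIMILAR TO but special in Python sets.
                out.append("\\\\")
            else:
                out.append(ch)
        elif ch == "[":
            in_bracket = True
            out.append(ch)
        elif ch == "_":
            out.append(".")       # underscore: any single character
        elif ch == "%":
            out.append(".*")      # percent: any string, as in LIKE
        elif ch in "()|*+":
            out.append(ch)        # same meaning in both syntaxes
        else:
            # Everything else (including . ? $ \) is a plain character here.
            out.append(re.escape(ch))
        i += 1
    return "".join(out)

def similar(value, pattern, escape=None):
    # fullmatch supplies the implicit ^ ... $ anchoring described above.
    regex = similar_to_regex(pattern, escape)
    return re.fullmatch(regex, value, re.DOTALL) is not None
```

For example, similar("abc", "%b%") holds, while similar("abc", "b") does not, because the pattern must cover the whole string.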

8.6 Similar predicate
Function
Specify a character string similarity by means of a regular expression.

Format

<similar predicate> ::=
    <character match value> [ NOT ] SIMILAR TO <similar pattern>
    [ ESCAPE <escape character> ]

<similar pattern> ::= <character value expression>

<regular expression> ::=
      <regular term>
    | <regular expression> <vertical bar> <regular term>

<regular term> ::=
      <regular factor>
    | <regular term> <regular factor>

<regular factor> ::=
      <regular primary>
    | <regular primary> <asterisk>
    | <regular primary> <plus sign>

<regular primary> ::=
      <character specifier>
    | <percent>
    | <regular character set>
    | <left paren> <regular expression> <right paren>

<character specifier> ::=
      <non-escaped character>
    | <escaped character>

<non-escaped character> ::= !! See the Syntax Rules
<escaped character> ::= !! See the Syntax Rules

<regular character set> ::=
      <underscore>
    | <left bracket> <character enumeration>... <right bracket>
    | <left bracket> <circumflex>
        <character enumeration>... <right bracket>
    | <left bracket> <colon> <regular character set identifier>
        <colon> <right bracket>

<character enumeration> ::=
      <character specifier>
    | <character specifier> <minus sign> <character specifier>

<regular character set identifier> ::= <identifier>

*stuff omitted*

3) The value of the <identifier> that is a <regular character set
identifier> shall be either ALPHA, UPPER, LOWER, DIGIT, or ALNUM.
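
As an illustration of rule 3, the five identifiers could be expanded into ordinary bracket sets before handing the pattern to another regex engine. This is only a sketch: the helper expand_named_sets is hypothetical, the ASCII-only ranges are an assumption (the actual membership depends on the character set in use), and accepting the identifier in any letter case is also an assumption.

```python
import re

# Assumed ASCII-only equivalents for the five identifiers named in rule 3.
CHARSET_CLASSES = {
    "ALPHA": "[A-Za-z]",
    "UPPER": "[A-Z]",
    "LOWER": "[a-z]",
    "DIGIT": "[0-9]",
    "ALNUM": "[A-Za-z0-9]",
}

# Matches the form [:identifier:] from the <regular character set> production.
NAMED_SET = re.compile(r"\[:([A-Za-z]+):\]")

def expand_named_sets(pattern):
    """Replace each [:NAME:] with an explicit bracket set (sketch only)."""
    def repl(match):
        name = match.group(1).upper()
        if name not in CHARSET_CLASSES:
            raise ValueError("unknown regular character set identifier: " + name)
        return CHARSET_CLASSES[name]
    return NAMED_SET.sub(repl, pattern)
```

With this, expand_named_sets("[:DIGIT:]+") gives "[0-9]+", and an identifier outside the five listed raises an error.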

*collating stuff omitted*

5) A <non-escaped character> is any single character from the
character set of the <similar pattern> that is not a <left bracket>,
<right bracket>, <left paren>, <right paren>, <vertical bar>,
<circumflex>, <minus sign>, <plus sign>, <asterisk>, <underscore>,
<percent>, or the character specified by the result of the <character
value expression> of <escape character>. A <character specifier> that
is a <non-escaped character> represents itself.

6) An <escaped character> is a sequence of two characters: the
character specified by the result of the <character value expression>
of <escape character>, followed by a second character that is a <left
bracket>, <right bracket>, <left paren>, <right paren>, <vertical
bar>, <circumflex>, <minus sign>, <plus sign>, <asterisk>,
<underscore>, <percent>, or the character specified by the result of
the <character value expression> of <escape character>. A <character
specifier> that is an <escaped character> represents its second
character.
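
Rules 5 and 6 can be read as a small scanner over the pattern. The sketch below classifies each position as a special character, a non-escaped character, or an escaped character, and rejects an escape character followed by anything other than a special character or the escape character itself. The function name tokenize and the token labels are made up for illustration.

```python
# The characters that rules 5 and 6 single out: [ ] ( ) | ^ - + * _ %
SPECIAL = set("[]()|^-+*_%")

def tokenize(pattern, escape=None):
    """Split a SIMILAR TO pattern into (kind, character) pairs (sketch only)."""
    tokens = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if escape is not None and ch == escape:
            # Rule 6: escape character followed by a special character
            # or by the escape character itself.
            if i + 1 >= len(pattern):
                raise ValueError("pattern ends with the escape character")
            nxt = pattern[i + 1]
            if nxt not in SPECIAL and nxt != escape:
                raise ValueError("invalid escaped character: " + repr(nxt))
            tokens.append(("escaped", nxt))   # represents its second character
            i += 2
        elif ch in SPECIAL:
            tokens.append(("special", ch))
            i += 1
        else:
            # Rule 5: any other character represents itself.
            tokens.append(("literal", ch))
            i += 1
    return tokens
```

For instance, tokenize("a#_b", escape="#") yields a literal a, an escaped _, and a literal b, so the underscore no longer acts as a wildcard.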