Re: BUG #4044: Incorrect RegExp substring Output

From: "Rui Martins" <Rui(dot)Martins(at)PDMFC(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #4044: Incorrect RegExp substring Output
Date: 2008-03-19 11:09:15
Message-ID: 1321.B1UHWUVdEF8=.1205924955.squirrel@www.pdmfc.com
Lists: pgsql-bugs

> "Rui Martins" <Rui(dot)Martins(at)pdmfc(dot)com> writes:
>> Description: Incorrect RegExp substring Output
>
>> SUBSTRING( BedNo FROM '^[[:digit:]]+[a-zA-Z]*(:[[:digit:]]+)?$' )
>
> Interesting. It had never occurred to me that it's possible for the
> whole pattern to have a match when some parenthesized subexpression
> has no match. On investigation, Tcl's regex library seems to get
> this right, but textregexsubstr() doesn't. Will fix.
>
>> I would expect the result for BedNumber to be either NULL or the EMPTY
>> String, and the latter seems more logical.
>
> It's going to be null. Your example has no match to the parenthesized
> substring --- a match would have to include a colon and some digits, no?

Yes, the subexpression will not match, but the entire expression will.
Taking this into account I agree that it should be NULL then, but this
should be CLEARLY stated in the MANUAL, so that the user will not have to
guess.
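
To illustrate the case in question, here is a rough sketch in Python. Note
the assumptions: Python's re module is only a stand-in for the PostgreSQL
regex engine, [0-9] replaces [[:digit:]] (which Python does not support),
and the helper name bed_suffix and the sample values are made up for this
example. It shows the whole pattern matching while the parenthesized
subexpression does not take part, which is reported as "no value" rather
than as an empty string.

```python
import re

# Stand-in for the pattern from the report; [0-9] replaces [[:digit:]].
PATTERN = re.compile(r'^[0-9]+[a-zA-Z]*(:[0-9]+)?$')


def bed_suffix(bed_no):
    """Return the text matched by the first parenthesized group, or None."""
    m = PATTERN.match(bed_no)
    if m is None:
        return None       # the whole pattern failed to match
    return m.group(1)     # None when the optional group did not participate


print(bed_suffix('12A:3'))  # :3
print(bed_suffix('12A'))    # None (pattern matches, group does not)
print(bed_suffix('A12'))    # None (pattern does not match at all)
```

The second call is the situation from the bug report: a NULL-like result
even though the input is valid according to the pattern.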

I believe the manual should have a more detailed explanation of the
substring function, because I haven't found a section dedicated to it.
The information is scattered around the string functions page.

> regards, tom lane

Thank you for your feedback.

P.S.
Will the fix be available as a patch, or only in 8.3.1?

See ya
Rui Martins
