Re: Improved regular expression error message for backrefs

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Improved regular expression error message for backrefs
Date: 2021-08-23 02:47:25
Message-ID: 58295.1629686845@sss.pgh.pa.us
Lists: pgsql-hackers

Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> writes:
> The patch defines a new error code REG_ENOBREF in regex/regex.h right next to REG_ESUBREG from which it is split out, rather than at the end of the list. Is there a project preference to add it at the end? Certainly, that would give a shorter git diff.
> Are there dependencies on the current error messages which prevent such changes?

Yeah: the POSIX standard says what the error codes from regcomp() are.

POSIX defines

REG_ESUBREG
Number in \digit invalid or in error.

which does seem to cover this case, so what I'd argue is that we should
improve the "invalid backreference number" text rather than invent
a nonstandard error code. Maybe something like "backreference number does
not exist or cannot be referenced from here"?
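
For illustration only, here is a minimal sketch of how the standard code
surfaces through the POSIX API on a typical libc. The helper name
compile_status() is hypothetical, and the message text that regerror()
returns is implementation-defined, so the sketch relies only on the
numeric code. It uses basic-regex syntax, since that is where POSIX
specifies back-references.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of PostgreSQL or POSIX): compile a POSIX
 * basic regular expression and return regcomp()'s status code.  If msg is
 * non-NULL, it receives the implementation's text for a failure, or an
 * empty string on success.
 */
static int
compile_status(const char *pattern, char *msg, size_t msglen)
{
    regex_t re;
    int     rc;

    assert(pattern != NULL);

    rc = regcomp(&re, pattern, 0);      /* cflags 0 = basic regex syntax */

    if (msg != NULL && msglen > 0)
    {
        if (rc != 0)
            regerror(rc, &re, msg, msglen);
        else
            msg[0] = '\0';
    }

    if (rc == 0)
        regfree(&re);                   /* only free a successful compile */

    return rc;
}
```

With this helper, a pattern such as \(a\)\2 (a reference to a group that
does not exist) would be expected to yield REG_ESUBREG, while \(a\)\1
compiles cleanly. Callers that key off the numeric code keep working no
matter how the message wording is changed.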

(Admittedly, there's not a huge reason why src/backend/regex/ needs to
stay compliant with the POSIX API today. But I still have ambitions to
split that out as a free-standing library someday, as Henry Spencer had
originally planned to do. So I'd rather stick to the API spec.)

It might be worth checking what text is attributed to this error code
by PCRE and other implementations of the POSIX spec.

regards, tom lane
