From: | Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> |
---|---|
To: | Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Improved regular expression error message for backrefs |
Date: | 2021-08-23 00:26:40 |
Message-ID: | E77ABEF5-8CB5-4777-A654-1B1FA32D620E@enterprisedb.com |
Lists: | pgsql-hackers |
Hackers,
Please find attached an improvement to the error messages given for invalid backreference usage:
select 'xyz' ~ '(.)(.)\3';
ERROR: invalid regular expression: invalid backreference number
select 'xyz' ~ '(.)(.)(?=\2)';
-ERROR: invalid regular expression: invalid backreference number
+ERROR: invalid regular expression: backreference in lookaround assertion
The first regexp is invalid because only two capture groups exist, so \3 doesn't refer to anything. The second regexp is rejected because the regular expression system does not support backreferences within lookaround assertions. (See the docs, section 9.7.3.6. Limits And Compatibility.) It is flat wrong to say the backreference number is invalid in that case: there is a perfectly valid capture group that \2 refers to.
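To make the distinction between the two cases concrete, here is a minimal standalone sketch. It is not the patch itself: the `sketch_`/`SKETCH_` names, the numeric values, and the order in which the two checks run are assumptions made for illustration, and only the two message strings come from the examples above.

```c
#include <stddef.h>

/* Hypothetical codes for illustration only. The real definitions live in
 * regex/regex.h, and their numeric values may differ from these. */
enum sketch_regex_error
{
    SKETCH_REG_OKAY = 0,
    SKETCH_REG_ESUBREG,     /* backreference number names no capture group */
    SKETCH_REG_ENOBREF      /* backreference appears inside a lookaround */
};

/*
 * Classify a backreference \N given the number of capture groups seen so
 * far and whether we are currently inside a lookaround assertion.
 * Assumption: the lookaround check runs first; the source does not say
 * which error wins when both conditions hold.
 */
int
sketch_classify_backref(int backref, int ngroups, int in_lookaround)
{
    if (in_lookaround)
        return SKETCH_REG_ENOBREF;
    if (backref < 1 || backref > ngroups)
        return SKETCH_REG_ESUBREG;
    return SKETCH_REG_OKAY;
}

/* Map a sketch error code to the message text shown in the examples. */
const char *
sketch_regex_strerror(int code)
{
    switch (code)
    {
        case SKETCH_REG_OKAY:
            return "no errors detected";
        case SKETCH_REG_ESUBREG:
            return "invalid backreference number";
        case SKETCH_REG_ENOBREF:
            return "backreference in lookaround assertion";
        default:
            return "unknown error";
    }
}
```

With these definitions, `sketch_classify_backref(3, 2, 0)` corresponds to the first query (\3 with only two groups) and `sketch_classify_backref(2, 2, 1)` to the second (\2 inside a lookahead), so each case maps to its own message.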
The patch defines a new error code, REG_ENOBREF, in regex/regex.h right next to REG_ESUBREG, from which it is split out, rather than at the end of the list. Is there a project preference for adding new codes at the end? Certainly, that would give a shorter git diff.
Are there dependencies on the current error messages that would prevent such changes?
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Distinguishing-regular-expression-backref-errors.patch | application/octet-stream | 6.4 KB |