SIMILAR TO expressions translate wildcards where they shouldn't

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: SIMILAR TO expressions translate wildcards where they shouldn't
Date: 2025-05-22 21:18:44
Message-ID: 16ab039d1af455652bdf4173402ddda145f2c73b.camel@cybertec.at
Lists: pgsql-bugs

The following surprising result

SELECT 'a_b' SIMILAR TO '[_[:alpha:]]*',
'a_b' SIMILAR TO '[[:alpha:]_]*';

 ?column? │ ?column?
══════════╪══════════
 t        │ f
(1 row)

becomes clear when we look at how the expressions are translated to
regular expressions:

EXPLAIN (VERBOSE, GENERIC_PLAN, COSTS OFF)
SELECT $1 SIMILAR TO '[_[:alpha:]]*',
$1 SIMILAR TO '[[:alpha:]_]*';

QUERY PLAN
══════════════════════════════════════════════════════════════════════════════════
Result
Output: ($1 ~ '^(?:[_[:alpha:]]*)$'::text), ($1 ~ '^(?:[[:alpha:].]*)$'::text)
(2 rows)

The underscore before the [:alpha:] is left alone, but the one after
it gets translated to a period. The underscore is a SIMILAR TO wildcard
that corresponds to the period in regular expressions, but characters
inside square brackets should lose their special meaning. The code in
utils/adt/regexp.c doesn't expect that square brackets can be nested,
so it treats the closing bracket of [:alpha:] as the end of the whole
bracket expression.

The attached patch fixes the bug.
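
To illustrate the failure mode, here is a toy model in Python. It is
neither the attached patch nor the code in utils/adt/regexp.c: the
function name translate and the track_nesting flag are made up for
this sketch, and it only handles brackets and the two wildcards (no
escapes, no other SIMILAR TO syntax). With track_nesting off, any ']'
ends the bracket expression, which reproduces the output shown above.
With it on, a simple depth counter keeps the second underscore
literal.

```python
def translate(pattern: str, track_nesting: bool) -> str:
    """Toy SIMILAR TO -> POSIX regex translation (wildcards and brackets only)."""
    out = []
    depth = 0  # number of '[' currently considered open
    for ch in pattern:
        if ch == "[":
            # Naive mode only remembers "inside brackets: yes or no".
            depth = depth + 1 if track_nesting else 1
            out.append(ch)
        elif ch == "]" and depth > 0:
            # Naive mode treats any ']' as the end of the bracket expression.
            depth = depth - 1 if track_nesting else 0
            out.append(ch)
        elif depth > 0:
            out.append(ch)      # inside brackets: copy the character literally
        elif ch == "_":
            out.append(".")     # wildcard: any single character
        elif ch == "%":
            out.append(".*")    # wildcard: any string
        else:
            out.append(ch)
    return "^(?:" + "".join(out) + ")$"


print(translate("[_[:alpha:]]*", track_nesting=False))  # ^(?:[_[:alpha:]]*)$
print(translate("[[:alpha:]_]*", track_nesting=False))  # ^(?:[[:alpha:].]*)$
print(translate("[[:alpha:]_]*", track_nesting=True))   # ^(?:[[:alpha:]_]*)$
```

A plain depth counter is only an approximation. For example, it does
not handle a literal ']' placed right after '[' or '[^', so a real fix
needs more care than this.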

Yours,
Laurenz Albe

Attachment                                      Content-Type  Size
v1-0001-Fix-SIMILAR-TO-regex-translation.patch  text/x-patch  3.9 KB
