Re: Small improvements to substring()

From: Zsolt Parragi <zsolt(dot)parragi(at)percona(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Junwang Zhao <zhjwpku(at)gmail(dot)com>
Subject: Re: Small improvements to substring()
Date: 2026-02-16 21:05:11
Message-ID: CAN4CZFPgL6NyFDLZCvfwHygNRy1F1L1CihRJV-a7_hVerjZ_Hw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello!

The patch looks good, but I think there's an edge case where it causes
a performance regression with a negative start and long strings,
because of the removal of this condition:

- /*
- * Ending at position 1, exclusive, obviously yields an empty
- * string. A zero or negative value can happen if the start was
- * negative or one. SQL99 says to return a zero-length string.
- */
- if (E <= 1)
- return cstring_to_text("");
-

See the following example:

CREATE TEMP TABLE toast_large (c text);
INSERT INTO toast_large VALUES (repeat(U&'\2026', 1000000));

\timing on
SELECT length(substring(c, -5, 3)) FROM toast_large;
\timing off

Without the patch it's 0.4ms
With the patch this executes in 2ms
With the patch and the condition added back it's 0.4ms again

(all release build times on my pc)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2026-02-16 21:13:37 Re: Lowering the default wal_blocksize to 4K
Previous Message Andres Freund 2026-02-16 21:00:56 Re: Adding locks statistics