Re: Undocumented(?) limits on regexp functions

From: "Tels" <nospam-pg-abuse(at)bloodgate(dot)com>
To: "Andrew Gierth" <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Undocumented(?) limits on regexp functions
Date: 2018-08-14 17:01:48
Message-ID: c8dae9b84805d7b10e3a7a6f37afb1b3.squirrel@sm.webmail.pair.com
Lists: pgsql-hackers

Moin Andrew,

On Tue, August 14, 2018 9:16 am, Andrew Gierth wrote:
>>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>
> >> Should these limits:
>
> >> a) be removed
>
> Tom> Doubt it --- we could use the "huge" request variants, maybe, but
> Tom> I wonder whether the engine could run fast enough that you'd want
> Tom> to.
>
> I do wonder (albeit without evidence) whether the quadratic slowdown
> problem I posted a patch for earlier was ignored for so long because
> people just went "meh, regexps are slow" rather than wondering why a
> trivial splitting of a 40kbyte string was taking more than a second.

Pretty much this. :)

First of all, thank you for working in this area; it is very welcome.

We do use UTF-8, and we did notice that the regexp functions are not
exactly the fastest around, although we have not (yet) run into the memory
limit. That is mostly because we only use the regexp_match* functions in
places where performance is not key and the input/output is small
(although, now that I mention it, the quadratic behaviour might explain a
few slowdowns in other cases that I need to investigate).

Anyway, in a few places we have functions that use a lot of regexps (more
than a dozen) that are also moderately complex (e.g. they span multiple
lines). In these cases the performance was not really up to par, so I
experimented and in the end rewrote the functions in plperl, which fixed
the performance problem for us.

All the best,

Tels
