Re: BUG #16241: Degraded hash join performance

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thomas Butz <tbutz(at)optitool(dot)de>, pgsql-bugs <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16241: Degraded hash join performance
Date: 2020-02-04 16:29:01
Message-ID: 20200204162901.s5hbfrl2ylb3jjsq@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

On 2020-02-04 11:00:29 -0500, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > Interesting! The no-children one clearly shows that a lot of the the
> > time is spent evaluating regular expressions (there's other regex
> > functions in the profile too):
> > 23.36% postgres postgres [.] subcolor
>
> Huh ...
>
> > I'm not aware of any relevant regular expression evaluation changes
> > between 11 and 12. Tom, does this trigger anything?
>
> (1) Nope, I'm not either; the last non-back-patched change in that
> code was c54159d44 in v10.
>
> (2) subcolor() is part of regex compilation, not execution, which makes
> one wonder why it's showing up at all. Maybe the regex cache in
> adt/regexp.c is overflowing and preventing useful caching? But
> that didn't change in v12 either. Are these test cases really
> 100% equivalent? I'm wondering if there are a few more "hot"
> regex patterns in the v12 data ...

They are not 100% equivalent, but the part of the plan we see is very
similar rowcount wise. It's possible that the functions differ more
however, there are different postgis versions involved, and apparently
also an "osml10n" extension.

> (3) Where the heck is the regex use coming from at all? I don't
> see any regex operators in the plan. Maybe it's inside the
> plpgsql function?

It definitely is. The stack shows at least two levels of plpgsql
functions. And Thomas has since confirmed that removing the functioncall
fixes the issue.

Based on the name I think this is somewhere around this:
https://github.com/giggls/mapnik-german-l10n/blob/master/plpgsql/get_localized_name_from_tags.sql#L120
The callgraph indicates that most of the cost comes from within
textregexreplace_noopt.

Not clear why the cache isn't fixing this - there are no variables in
the regexp_replace calls as far as I can see.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Jehan-Guillaume de Rorthais 2020-02-04 17:39:06 FK violation in partitioned table after truncating a referenced partition
Previous Message Tom Lane 2020-02-04 16:27:38 Re: BUG #16243: non super user take pg_restore found some errors.