Re: Degression (PG10 > 11, 12 or 13)

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Johannes Graën <johannes(at)selfnet(dot)de>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Degression (PG10 > 11, 12 or 13)
Date: 2021-05-28 15:48:33
Message-ID: CAFj8pRAo0wjUc_eU1tKs71fQChQRad07Wqz0gkByx5bVuwEiSA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

hI

pá 28. 5. 2021 v 16:12 odesílatel Johannes Graën <johannes(at)selfnet(dot)de>
napsal:

> Hi,
>
> When trying to upgrade an existing database from version 10 to 13 I came
> across a degression in some existing code used by clients. Further
> investigations showed that performance measures are similar in versions
> 11 to 13, while in the original database on version 10 it's around 100
> times faster. I could boil it down to perl functions used for sorting.
>
> >From the real data that I don't own, I created a test case that is
> sufficient to observe the degression: http://ix.io/3o7f
>
>
> These are the numbers on PG 10:
>
> > test=# explain (analyze, verbose, buffers)
> > select attr from tab order by func(attr);
> > QUERY PLAN
> >
> ----------------------------------------------------------------------------------------------------------------------
> > Sort (cost=3269.68..3294.36 rows=9869 width=40) (actual
> time=179.374..180.558 rows=9869 loops=1)
> > Output: attr, (func(attr))
> > Sort Key: (func(tab.attr))
> > Sort Method: quicksort Memory: 1436kB
> > Buffers: shared hit=49
> > -> Seq Scan on public.tab (cost=0.00..2614.94 rows=9869 width=40)
> (actual time=2.293..169.060 rows=9869 loops=1)
> > Output: attr, func(attr)
> > Buffers: shared hit=49
> > Planning time: 0.318 ms
> > Execution time: 182.061 ms
> > (10 rows)
> >
> > test=# explain (analyze, verbose, buffers)
> > select attr from tab;
> > QUERY PLAN
> >
> ------------------------------------------------------------------------------------------------------------
> > Seq Scan on public.tab (cost=0.00..147.69 rows=9869 width=8) (actual
> time=0.045..3.975 rows=9869 loops=1)
> > Output: attr
> > Buffers: shared hit=49
> > Planning time: 0.069 ms
> > Execution time: 6.020 ms
> > (5 rows)
>
>
> And here we have PG 11:
>
> > test=# explain (analyze, verbose, buffers)
> > select attr from tab order by func(attr);
> > QUERY PLAN
> >
> ----------------------------------------------------------------------------------------------------------------------
> > Sort (cost=3269.68..3294.36 rows=9869 width=40) (actual
> time=597.877..599.805 rows=9869 loops=1)
> > Output: attr, (func(attr))
> > Sort Key: (func(tab.attr))
> > Sort Method: quicksort Memory: 1436kB
> > Buffers: shared hit=49
> > -> Seq Scan on public.tab (cost=0.00..2614.94 rows=9869 width=40)
> (actual time=0.878..214.188 rows=9869 loops=1)
> > Output: attr, func(attr)
> > Buffers: shared hit=49
> > Planning Time: 0.151 ms
> > Execution Time: 601.767 ms
> > (10 rows)
> >
> > test=# explain (analyze, verbose, buffers)
> > select attr from tab;
> > QUERY PLAN
> >
> ------------------------------------------------------------------------------------------------------------
> > Seq Scan on public.tab (cost=0.00..147.69 rows=9869 width=8) (actual
> time=0.033..1.628 rows=9869 loops=1)
> > Output: attr
> > Buffers: shared hit=49
> > Planning Time: 0.043 ms
> > Execution Time: 2.581 ms
> > (5 rows)
>
>
> In the real scenario it's 500ms vs. 50s. The reason is obviously the
> perl function used as sort key. All different versions have been tested
> with an unmodified config and one tunes with pgtune. Creating a
> functional index does not help in the original database as the planner
> doesn't use it, while it *is* used in the test case. But the question
> what causes that noticeable difference in performance is untouched by
> the fact that it could be circumvented in some cases.
>
> The perl version used is v5.24.1.
>

I looked on profile - Postgres 14

5,67% libperl.so.5.32.1 [.] Perl_utf8_length
5,44% libc-2.33.so [.] __strcoll_l
4,73% libperl.so.5.32.1 [.] Perl_pp_subst
4,33% libperl.so.5.32.1 [.] Perl_re_intuit_start
3,25% libperl.so.5.32.1 [.] Perl_fbm_instr
1,92% libperl.so.5.32.1 [.] Perl_regexec_flags
1,79% libperl.so.5.32.1 [.] Perl_runops_standard
1,16% libperl.so.5.32.1 [.] Perl_pp_const
0,97% perf [.] 0x00000000002fcf93
0,94% libperl.so.5.32.1 [.] Perl_pp_nextstate
0,68% libperl.so.5.32.1 [.] Perl_do_trans
0,54% perf [.] 0x00000000003dd0c5

and Postgres - 10

5,45% libperl.so.5.32.1 [.] Perl_utf8_length
4,78% libc-2.33.so [.] __strcoll_l
4,15% libperl.so.5.32.1 [.] Perl_re_intuit_start
3,92% libperl.so.5.32.1 [.] Perl_pp_subst
2,99% libperl.so.5.32.1 [.] Perl_fbm_instr
1,77% libperl.so.5.32.1 [.] Perl_regexec_flags
1,59% libperl.so.5.32.1 [.] Perl_runops_standard
1,02% libperl.so.5.32.1 [.] Perl_pp_const
0,99% [kernel] [k] psi_group_change
0,85% [kernel] [k] switch_mm_irqs_off

and it doesn't look too strange.

-- postgres 14
postgres=# explain (analyze, verbose, buffers)
select attr from tab order by func(attr);
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN

╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Sort (cost=3269.68..3294.36 rows=9869 width=40) (actual
time=246.612..247.292 rows=9869 loops=1) │
│ Output: attr, (func(attr))

│ Sort Key: (func(tab.attr))

│ Sort Method: quicksort Memory: 1436kB

│ Buffers: shared hit=49

│ -> Seq Scan on public.tab (cost=0.00..2614.94 rows=9869 width=40)
(actual time=0.102..201.863 rows=9869 loops=1) │
│ Output: attr, func(attr)

│ Buffers: shared hit=49

│ Planning Time: 0.057 ms

│ Execution Time: 248.386 ms

└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(10 rows)

-- postgres 10
postgres=# explain (analyze, verbose, buffers)
select attr from tab order by func(attr);
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN

╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Sort (cost=3269.68..3294.36 rows=9869 width=40) (actual
time=267.779..268.366 rows=9869 loops=1) │
│ Output: attr, (func(attr))

│ Sort Key: (func(tab.attr))

│ Sort Method: quicksort Memory: 1436kB

│ Buffers: shared hit=49

│ -> Seq Scan on public.tab (cost=0.00..2614.94 rows=9869 width=40)
(actual time=0.278..222.606 rows=9869 loops=1) │
│ Output: attr, func(attr)

│ Buffers: shared hit=49

│ Planning time: 0.132 ms

│ Execution time: 269.258 ms

└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(10 rows)

This is tested on my laptop - both version uses same locale

Are you sure, so all databases use the same encoding and same locale?

Regards

Pavel

> Best
> Johannes
>
>
>
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2021-05-28 16:06:19 Re: Add ZSON extension to /contrib/
Previous Message Tomas Vondra 2021-05-28 15:47:18 Re: Degression (PG10 > 11, 12 or 13)