| From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> | 
|---|---|
| To: | Johannes Graën <johannes(at)selfnet(dot)de> | 
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Degression (PG10 > 11, 12 or 13) | 
| Date: | 2021-05-28 15:48:33 | 
| Message-ID: | CAFj8pRAo0wjUc_eU1tKs71fQChQRad07Wqz0gkByx5bVuwEiSA@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
hI
pá 28. 5. 2021 v 16:12 odesílatel Johannes Graën <johannes(at)selfnet(dot)de>
napsal:
> Hi,
>
> When trying to upgrade an existing database from version 10 to 13 I came
> across a degression in some existing code used by clients. Further
> investigations showed that performance measures are similar in versions
> 11 to 13, while in the original database on version 10 it's around 100
> times faster. I could boil it down to perl functions used for sorting.
>
> >From the real data that I don't own, I created a test case that is
> sufficient to observe the degression: http://ix.io/3o7f
>
>
> These are the numbers on PG 10:
>
> > test=# explain (analyze, verbose, buffers)
> > select attr from tab order by func(attr);
> >                                                       QUERY PLAN
> >
> ----------------------------------------------------------------------------------------------------------------------
> >  Sort  (cost=3269.68..3294.36 rows=9869 width=40) (actual
> time=179.374..180.558 rows=9869 loops=1)
> >    Output: attr, (func(attr))
> >    Sort Key: (func(tab.attr))
> >    Sort Method: quicksort  Memory: 1436kB
> >    Buffers: shared hit=49
> >    ->  Seq Scan on public.tab  (cost=0.00..2614.94 rows=9869 width=40)
> (actual time=2.293..169.060 rows=9869 loops=1)
> >          Output: attr, func(attr)
> >          Buffers: shared hit=49
> >  Planning time: 0.318 ms
> >  Execution time: 182.061 ms
> > (10 rows)
> >
> > test=# explain (analyze, verbose, buffers)
> > select attr from tab;
> >                                                  QUERY PLAN
> >
> ------------------------------------------------------------------------------------------------------------
> >  Seq Scan on public.tab  (cost=0.00..147.69 rows=9869 width=8) (actual
> time=0.045..3.975 rows=9869 loops=1)
> >    Output: attr
> >    Buffers: shared hit=49
> >  Planning time: 0.069 ms
> >  Execution time: 6.020 ms
> > (5 rows)
>
>
> And here we have PG 11:
>
> > test=# explain (analyze, verbose, buffers)
> > select attr from tab order by func(attr);
> >                                                       QUERY PLAN
> >
> ----------------------------------------------------------------------------------------------------------------------
> >  Sort  (cost=3269.68..3294.36 rows=9869 width=40) (actual
> time=597.877..599.805 rows=9869 loops=1)
> >    Output: attr, (func(attr))
> >    Sort Key: (func(tab.attr))
> >    Sort Method: quicksort  Memory: 1436kB
> >    Buffers: shared hit=49
> >    ->  Seq Scan on public.tab  (cost=0.00..2614.94 rows=9869 width=40)
> (actual time=0.878..214.188 rows=9869 loops=1)
> >          Output: attr, func(attr)
> >          Buffers: shared hit=49
> >  Planning Time: 0.151 ms
> >  Execution Time: 601.767 ms
> > (10 rows)
> >
> > test=# explain (analyze, verbose, buffers)
> > select attr from tab;
> >                                                  QUERY PLAN
> >
> ------------------------------------------------------------------------------------------------------------
> >  Seq Scan on public.tab  (cost=0.00..147.69 rows=9869 width=8) (actual
> time=0.033..1.628 rows=9869 loops=1)
> >    Output: attr
> >    Buffers: shared hit=49
> >  Planning Time: 0.043 ms
> >  Execution Time: 2.581 ms
> > (5 rows)
>
>
> In the real scenario it's 500ms vs. 50s. The reason is obviously the
> perl function used as sort key. All different versions have been tested
> with an unmodified config and one tunes with pgtune. Creating a
> functional index does not help in the original database as the planner
> doesn't use it, while it *is* used in the test case. But the question
> what causes that noticeable difference in performance is untouched by
> the fact that it could be circumvented in some cases.
>
> The perl version used is v5.24.1.
>
I looked on profile - Postgres 14
   5,67%  libperl.so.5.32.1                     [.] Perl_utf8_length
   5,44%  libc-2.33.so                          [.] __strcoll_l
   4,73%  libperl.so.5.32.1                     [.] Perl_pp_subst
   4,33%  libperl.so.5.32.1                     [.] Perl_re_intuit_start
   3,25%  libperl.so.5.32.1                     [.] Perl_fbm_instr
   1,92%  libperl.so.5.32.1                     [.] Perl_regexec_flags
   1,79%  libperl.so.5.32.1                     [.] Perl_runops_standard
   1,16%  libperl.so.5.32.1                     [.] Perl_pp_const
   0,97%  perf                                  [.] 0x00000000002fcf93
   0,94%  libperl.so.5.32.1                     [.] Perl_pp_nextstate
   0,68%  libperl.so.5.32.1                     [.] Perl_do_trans
   0,54%  perf                                  [.] 0x00000000003dd0c5
and Postgres - 10
   5,45%  libperl.so.5.32.1                      [.] Perl_utf8_length
   4,78%  libc-2.33.so                           [.] __strcoll_l
   4,15%  libperl.so.5.32.1                      [.] Perl_re_intuit_start
   3,92%  libperl.so.5.32.1                      [.] Perl_pp_subst
   2,99%  libperl.so.5.32.1                      [.] Perl_fbm_instr
   1,77%  libperl.so.5.32.1                      [.] Perl_regexec_flags
   1,59%  libperl.so.5.32.1                      [.] Perl_runops_standard
   1,02%  libperl.so.5.32.1                      [.] Perl_pp_const
   0,99%  [kernel]                               [k] psi_group_change
   0,85%  [kernel]                               [k] switch_mm_irqs_off
and it doesn't look too strange.
-- postgres 14
postgres=# explain (analyze, verbose, buffers)
 select attr from tab order by func(attr);
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                      QUERY PLAN
                                           │
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Sort  (cost=3269.68..3294.36 rows=9869 width=40) (actual
time=246.612..247.292 rows=9869 loops=1)                    │
│   Output: attr, (func(attr))
                                            │
│   Sort Key: (func(tab.attr))
                                            │
│   Sort Method: quicksort  Memory: 1436kB
                                            │
│   Buffers: shared hit=49
                                            │
│   ->  Seq Scan on public.tab  (cost=0.00..2614.94 rows=9869 width=40)
(actual time=0.102..201.863 rows=9869 loops=1) │
│         Output: attr, func(attr)
                                            │
│         Buffers: shared hit=49
                                            │
│ Planning Time: 0.057 ms
                                           │
│ Execution Time: 248.386 ms
                                            │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(10 rows)
-- postgres 10
postgres=# explain (analyze, verbose, buffers)
 select attr from tab order by func(attr);
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                      QUERY PLAN
                                           │
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Sort  (cost=3269.68..3294.36 rows=9869 width=40) (actual
time=267.779..268.366 rows=9869 loops=1)                    │
│   Output: attr, (func(attr))
                                            │
│   Sort Key: (func(tab.attr))
                                            │
│   Sort Method: quicksort  Memory: 1436kB
                                            │
│   Buffers: shared hit=49
                                            │
│   ->  Seq Scan on public.tab  (cost=0.00..2614.94 rows=9869 width=40)
(actual time=0.278..222.606 rows=9869 loops=1) │
│         Output: attr, func(attr)
                                            │
│         Buffers: shared hit=49
                                            │
│ Planning time: 0.132 ms
                                           │
│ Execution time: 269.258 ms
                                            │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(10 rows)
This is tested on my laptop - both version uses same locale
Are you sure, so all databases use the same encoding and same locale?
Regards
Pavel
> Best
>   Johannes
>
>
>
>
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tomas Vondra | 2021-05-28 16:06:19 | Re: Add ZSON extension to /contrib/ | 
| Previous Message | Tomas Vondra | 2021-05-28 15:47:18 | Re: Degression (PG10 > 11, 12 or 13) |