Degression (PG10 > 11, 12 or 13)

From: Johannes Graën <johannes(at)selfnet(dot)de>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Degression (PG10 > 11, 12 or 13)
Date: 2021-05-28 14:12:33
Message-ID: 9b836af1-a189-0bc2-d6d7-3af278e227ef@selfnet.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

When trying to upgrade an existing database from version 10 to 13 I came
across a degression in some existing code used by clients. Further
investigations showed that performance measures are similar in versions
11 to 13, while in the original database on version 10 it's around 100
times faster. I could boil it down to perl functions used for sorting.

>From the real data that I don't own, I created a test case that is
sufficient to observe the degression: http://ix.io/3o7f

These are the numbers on PG 10:

> test=# explain (analyze, verbose, buffers)
> select attr from tab order by func(attr);
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------
> Sort (cost=3269.68..3294.36 rows=9869 width=40) (actual time=179.374..180.558 rows=9869 loops=1)
> Output: attr, (func(attr))
> Sort Key: (func(tab.attr))
> Sort Method: quicksort Memory: 1436kB
> Buffers: shared hit=49
> -> Seq Scan on public.tab (cost=0.00..2614.94 rows=9869 width=40) (actual time=2.293..169.060 rows=9869 loops=1)
> Output: attr, func(attr)
> Buffers: shared hit=49
> Planning time: 0.318 ms
> Execution time: 182.061 ms
> (10 rows)
>
> test=# explain (analyze, verbose, buffers)
> select attr from tab;
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------
> Seq Scan on public.tab (cost=0.00..147.69 rows=9869 width=8) (actual time=0.045..3.975 rows=9869 loops=1)
> Output: attr
> Buffers: shared hit=49
> Planning time: 0.069 ms
> Execution time: 6.020 ms
> (5 rows)

And here we have PG 11:

> test=# explain (analyze, verbose, buffers)
> select attr from tab order by func(attr);
> QUERY PLAN
> ----------------------------------------------------------------------------------------------------------------------
> Sort (cost=3269.68..3294.36 rows=9869 width=40) (actual time=597.877..599.805 rows=9869 loops=1)
> Output: attr, (func(attr))
> Sort Key: (func(tab.attr))
> Sort Method: quicksort Memory: 1436kB
> Buffers: shared hit=49
> -> Seq Scan on public.tab (cost=0.00..2614.94 rows=9869 width=40) (actual time=0.878..214.188 rows=9869 loops=1)
> Output: attr, func(attr)
> Buffers: shared hit=49
> Planning Time: 0.151 ms
> Execution Time: 601.767 ms
> (10 rows)
>
> test=# explain (analyze, verbose, buffers)
> select attr from tab;
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------
> Seq Scan on public.tab (cost=0.00..147.69 rows=9869 width=8) (actual time=0.033..1.628 rows=9869 loops=1)
> Output: attr
> Buffers: shared hit=49
> Planning Time: 0.043 ms
> Execution Time: 2.581 ms
> (5 rows)

In the real scenario it's 500ms vs. 50s. The reason is obviously the
perl function used as sort key. All different versions have been tested
with an unmodified config and one tunes with pgtune. Creating a
functional index does not help in the original database as the planner
doesn't use it, while it *is* used in the test case. But the question
what causes that noticeable difference in performance is untouched by
the fact that it could be circumvented in some cases.

The perl version used is v5.24.1.

Best
Johannes

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2021-05-28 14:22:26 Re: Add ZSON extension to /contrib/
Previous Message Greg Sabino Mullane 2021-05-28 14:10:00 Re: Reduce lock level for ALTER TABLE ... ADD CHECK .. NOT VALID