Bad planner performance for tables with empty tuples when using JIT

From: Jurrie Overgoor <postgresql-mailinglist(at)jurr(dot)org>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Bad planner performance for tables with empty tuples when using JIT
Date: 2020-10-27 12:13:48
Message-ID: 70ec49f7-e9c4-bea7-c689-a4dfbdd66f78@jurr.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello everyone,

I'm currently in the process of upgrading our PostgreSQL installations
from 9.6 to 13. I am experiencing very slow query performance for empty
tables.

Our test environments get build from scratch every run, and thus contain
a lot of empty tables at first. We hit the issue discussed and resolved
in [1]. Problem is, this fix is not included in PG13... but JIT is :)

Some of our queries join a lot of tables, so the planner thinks this
will result in very high costs, and the JIT feature kicks in. This gives
bad performance. Executing the query is very fast of course; it's just
the planning stage that is time consuming. We do not see this behavior
in PG9.6, as there is no JIT. EXPLAIN gives the same high costs on PG9.6
and PG13, but on PG9.6 the planning time is low enough.

Now, how can I circumvent this?

- I could wait for PG14. I verified that PG14 solves my case and queries
are fast. But this would take at least until the third quarter of 2021,
so the website tells me.

- I could ask for a back-port of [2]. The commit is API breaking; is
this even a feasible option?

- I could turn off JIT in the server config, but I'd like to use the JIT
feature where it's appropriate!

- I could turn off JIT using a GUC prior to query execution. But then I
would need to detect the cases where I need to do this. (I.e. cases
where I query a table that has no tuples.) It would be very cumbersome
for me to write this in my application, and I feel this is more the
responsibility of the database than of my program.

- Another way is to fake the number of pages; set it to 1 where it is 0
everywhere. I verified that this produces fast query performance. But
fiddling with pg_class does not "feel" right... Is this okay to use in a
test setup (and maybe even in a production scenario)? Could I do the
following query just after our database is initialized:

update pg_catalog.pg_class c set relpages = 1 where c.relpages = 0 and
c.relnamespace = (select ns.oid from pg_catalog.pg_namespace ns where
ns.nspname = current_schema);

Could someone give me advice on what would be the best strategy?

With kind regads,

Jurrie

[1] https://postgr.es/m/F02298E0-6EF4-49A1-BCB6-C484794D9ACC@thebuild.com
[2]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=3d351d916b20534f973eda760cde17d96545d4c4

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Magnus Hagander 2020-10-27 13:40:21 Re: Security issues concerning pgsql replication
Previous Message xiebin (F) 2020-10-27 11:34:31 答复: Security issues concerning pgsql replication