BUG #16223: Performance regression between 11.6 and 12.1 in an SQL query with a recursive CTE based on function

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: schwaderer(at)ivocotec(dot)de
Subject: BUG #16223: Performance regression between 11.6 and 12.1 in an SQL query with a recursive CTE based on function
Date: 2020-01-22 06:02:40
Message-ID: 16223-908afaa0eaf46ea2@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 16223
Logged by: ChristianS
Email address: schwaderer(at)ivocotec(dot)de
PostgreSQL version: 12.1
Operating system: Ubuntu 18.04
Description:

I have a query that runs significantly slower in Postgres 12.1 than it does
in Postgres 11.6.

(I have asked on dba.stackexchange
https://dba.stackexchange.com/questions/257759/recursive-cte-based-on-function-values-significantly-slower-on-postgres-12-than
- hoping that someone would jump in and tell me why this is my fault and why
it has nothing to do with PostgreSQL itself. However, since there is still
no reaction after two days, I can be somewhat sure that it's not a very
obvious mistake of mine and maybe something you might want to
investigate.)

First, we create this simple function

CREATE OR REPLACE FUNCTION public.my_test_function()
RETURNS SETOF record
LANGUAGE sql
IMMUTABLE SECURITY DEFINER
AS $function$

SELECT
1::integer AS id,
'2019-11-20'::date AS "startDate",
'2020-01-01'::date AS "endDate"

$function$;

Then for the actual query

WITH "somePeriods" AS (
SELECT * FROM my_test_function() AS
f(id integer, "startDate" date, "endDate" date)
),

"maxRecursiveEndDate" AS (

SELECT "startDate", "endDate", id,
(
WITH RECURSIVE prep("startDateParam", "endDateParam") AS (

SELECT "startDate","endDate" FROM "somePeriods" WHERE id = od.id
UNION
SELECT "startDate","endDate" FROM "somePeriods", prep
WHERE
"startDate" <= ("endDateParam" + '1 day'::interval ) AND ("endDateParam"
+ '1 day'::interval ) <= "endDate"
)
SELECT max("endDateParam") FROM prep
) AS "endDateNew"

FROM "somePeriods" AS od

)

SELECT * FROM "maxRecursiveEndDate";

What this actually does it not so important here, I guess. The important
point is: It runs very fast on Postgres 11.6 (like ca 4ms) and much slower
on PostgreSQL 12.1 (ca 150ms). The output of EXPLAIN ANALYZE did not give me
further hints.
A crucial point might or might be not, that are multiple CTEs involved,
including a RECURSIVE one. However, that's speculation.

What I tried out:
- I did try without my_test_function, i.e. putting the values directly into
the first CTE without using a function. This way, there was no problem at
all. Like this, it runs equally fast both on 12.1 and on 11.6.
- On Postgres 12, I played around with MATERIALIZED, but could not see any
effect. The query still runs as slow as before.

Note on reproducibility:
I was able to reproduce the phenomenon on various systems: on multiple VMs
in VirtualBox; via Docker on two different physical machines. (See below for
Docker commands.) However, strange enough, I cannot reproduce it on
https://www.db-fiddle.com/ (no difference to be seen there, both are
fast).

Docker commands:

# First, pull images of both versions

docker pull postgres:12.1
docker pull postgres:11.6

# Now, run Postgres 12

docker run -d --name my_postgres_12_container postgres:12.1

# Now, execute the query

docker exec my_postgres_12_container psql -U postgres -c "

CREATE OR REPLACE FUNCTION public.my_test_function()
RETURNS SETOF record
LANGUAGE sql
IMMUTABLE SECURITY DEFINER
AS \$function\$

SELECT
1::integer AS id,
'2019-11-20'::date AS \"startDate\",
'2020-01-01'::date AS \"endDate\"

\$function\$;

EXPLAIN ANALYZE WITH \"somePeriods\" AS (
SELECT * FROM my_test_function() AS
f(id integer, \"startDate\" date, \"endDate\" date)
),

\"maxRecursiveEndDate\" AS (

SELECT \"startDate\", \"endDate\", id,
(
WITH RECURSIVE prep(\"startDateParam\", \"endDateParam\") AS (

SELECT \"startDate\",\"endDate\" FROM \"somePeriods\" WHERE id = od.id
UNION
SELECT \"startDate\",\"endDate\" FROM \"somePeriods\", prep
WHERE
\"startDate\" <= (\"endDateParam\" + '1 day'::interval ) AND
(\"endDateParam\" + '1 day'::interval ) <= \"endDate\"
)
SELECT max(\"endDateParam\") FROM prep
) AS \"endDateNew\"

FROM \"somePeriods\" AS od

)

SELECT * FROM \"maxRecursiveEndDate\";
"

# Stop the Postgres 12 container

docker stop my_postgres_12_container

# Start Postgres 11 for comparison

docker run -d --name my_postgres_11_container postgres:11.6

# Execute the query in Postgres 11

docker exec my_postgres_11_container psql -U postgres -c "

CREATE OR REPLACE FUNCTION public.my_test_function()
RETURNS SETOF record
LANGUAGE sql
IMMUTABLE SECURITY DEFINER
AS \$function\$

SELECT
1::integer AS id,
'2019-11-20'::date AS \"startDate\",
'2020-01-01'::date AS \"endDate\"

\$function\$;

EXPLAIN ANALYZE WITH \"somePeriods\" AS (
SELECT * FROM my_test_function() AS
f(id integer, \"startDate\" date, \"endDate\" date)
),

\"maxRecursiveEndDate\" AS (

SELECT \"startDate\", \"endDate\", id,
(
WITH RECURSIVE prep(\"startDateParam\", \"endDateParam\") AS (

SELECT \"startDate\",\"endDate\" FROM \"somePeriods\" WHERE id = od.id
UNION
SELECT \"startDate\",\"endDate\" FROM \"somePeriods\", prep
WHERE
\"startDate\" <= (\"endDateParam\" + '1 day'::interval ) AND
(\"endDateParam\" + '1 day'::interval ) <= \"endDate\"
)
SELECT max(\"endDateParam\") FROM prep
) AS \"endDateNew\"

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2020-01-22 06:15:47 BUG #16224: Postgresql - First columnName is where clause has be double brackets?
Previous Message Michael Paquier 2020-01-22 03:37:04 Re: REINDEX CONCURRENTLY unexpectedly fails