performance regression in 9.2/9.3

From: Linos <info(at)linos(dot)es>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Merlin Moncure <mmoncure(at)gmail(dot)com>
Subject: performance regression in 9.2/9.3
Date: 2014-06-05 11:32:27
Message-ID: 5390554B.5080503@linos.es
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello all,

This is a continuation of the thread found here:
http://www.postgresql.org/message-id/538F2578.9080001@linos.es

Considering this seems to be a problem with the planner I thought that maybe would be a better idea to post this problem here.

To summarize the original thread I upgraded a medium (17Gb) database from PostgreSQL 8.4 to 9.3 and many of the queries my application uses started performing a lot slower, Merlin advised me to try disabling nestloop, this helped out for the particular query I was asking about but it is not a solution that I "can/would like" to use in the general case.

I simplified a little bit the original query and I have added another one with same problem.

query 1:
http://pastebin.com/32QxbNqW

query 1 postgres 9.3 nestloop enabled:
http://explain.depesz.com/s/6WX

query 1 postgres 8.4:
http://explain.depesz.com/s/Q7V

query 1 postgres 9.3 nestloop disabled:
http://explain.depesz.com/s/w1n

query 1 postgres 9.3 changed "having min(ts_recepcion) =" for "where ts_recepcion = "
http://explain.depesz.com/s/H5V

query 2:
http://pastebin.com/JmfPcRg8

query 2 postgres 9.3 nestloop enabled:
http://explain.depesz.com/s/EY7

query 2 postgres 8.4:
http://explain.depesz.com/s/Xc4

query 2 postgres 9.3 nestloop disabled:
http://explain.depesz.com/s/oO6O

query 2 postgres 9.3 changed "between" to "equal" for date filter:
http://explain.depesz.com/s/cP2H

As you can see in this links the problem disappears when I disable nestloop, another thing I discovered making different combinations of changes is that it seems to be related with date/timestamp fields, small changes to the queries fix the problem without disabling nestloop.

For example in query 1 changing this:
WHERE cab.id_almacen_destino = 109
GROUP BY mo.modelo_id
HAVING MIN(cab.time_stamp_recepcion)::date = (current_date - interval '30 days')::date

to this:
WHERE cab.id_almacen_destino = 109
AND cab.time_stamp_recepcion::date = (current_date - interval '30 days')::date
GROUP BY mo.modelo_id

in the first subquery fixed the execution time problem, I know the result is not the same, the second change is a better example:

In query2 changing this:
WHERE fecha BETWEEN '2014-05-19' AND '2014-05-19'
to this:
WHERE fecha = '2014-05-19'

fixes the problem, as you can see in the different explains.

This changes are not needed to make PostgreSQL 8.4 take the correct plan but they are in 9.2/9.3, I haven't tried 9.1 or 9.0 yet.

Merlin advised me to create a small test case, the thing is that the tables involved can be pretty large. The best way to create a good test case would be to use generate_series or something alike to try to replicate this problem from zero without any dump, no?

Regards,
Miguel Angel.

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2014-06-05 11:43:48 Re: slotname vs slot_name
Previous Message Andres Freund 2014-06-05 11:18:19 Re: slotname vs slot_name