Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2

From: Rodrigo Rosenfeld Rosas <rr(dot)rosas(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2
Date: 2012-11-06 17:11:58
Message-ID: 509944DE.6020400@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hello, this is my first message to this list, so sorry if this is not
the right place to discuss this or if some data is missing from this
message.

I'll gladly send any data you request that would help us to understand
this issue. I don't believe I'm allowed to share the actual database
dump, but other than that I can provide much more details you might ask for.

I can't understand why PG 9.2 performs so differently from PG 9.1.

I tested these queries in my Debian unstable amd64 box after restoring
the same database dump this morning in both PG 9.1 (Debian unstable
repository) and PG9.2 (Debian experimental repository) with same settings:

https://gist.github.com/3f1f3aad3847155e1e35

Ignore all lines like the line below because it doesn't make any
difference on my tests if I just remove them or any other column from
the SELECT clause:

" exists(select id from condition_document_excerpt where
condition_id=c1686.id) as v1686_has_reference,"

The results below are pretty much the same if you assume "SELECT 1 FROM
...".

I have proper indices created for all tables and the query is fast in
both PG versions when I don't use many conditions in the WHERE clause.

fast.sql returns the same data as slow.sql but it returns much faster in
my tests with PG 9.1.

So here are the completion times for each query on each PG version:

Query | PG 9.1 | PG 9.2 |
-----------------------------------
fast.sql| 650 ms (0.65s) | 690s |
slow.sql| 419s | 111s |

For the curious, the results would be very similar to slow.sql if I use
inner joins with the conditions inside the WHERE moved to the "ON"
clause of the inner join instead of the left outer join + global WHERE
approach. But I don't have this option anyway because this query is
generated dynamically and not all my queries are "ALL"-like queries.

Here are the relevant indices (id is SERIAL primary key in all tables):

CREATE UNIQUE INDEX transaction_condition_transaction_id_type_id_idx
ON transaction_condition
USING btree
(transaction_id, type_id);
CREATE INDEX index_transaction_condition_on_transaction_id
ON transaction_condition
USING btree
(transaction_id);
CREATE INDEX index_transaction_condition_on_type_id
ON transaction_condition
USING btree
(type_id);

CREATE INDEX acquirer_target_names
ON company_transaction
USING btree
(acquiror_company_name COLLATE pg_catalog."default",
target_company_name COLLATE pg_catalog."default");
CREATE INDEX index_company_transaction_on_target_company_name
ON company_transaction
USING btree
(target_company_name COLLATE pg_catalog."default");
CREATE INDEX index_company_transaction_on_date
ON company_transaction
USING btree
(date);
CREATE INDEX index_company_transaction_on_edit_status
ON company_transaction
USING btree
(edit_status COLLATE pg_catalog."default");

CREATE UNIQUE INDEX index_condition_boolean_value_on_condition_id
ON condition_boolean_value
USING btree
(condition_id);
CREATE INDEX index_condition_boolean_value_on_value_and_condition_id
ON condition_boolean_value
USING btree
(value COLLATE pg_catalog."default", condition_id);

CREATE UNIQUE INDEX index_condition_option_value_on_condition_id
ON condition_option_value
USING btree
(condition_id);
CREATE INDEX index_condition_option_value_on_value_id_and_condition_id
ON condition_option_value
USING btree
(value_id, condition_id);

CREATE INDEX index_condition_option_label_on_type_id_and_position
ON condition_option_label
USING btree
(type_id, "position");
CREATE INDEX index_condition_option_label_on_type_id_and_value
ON condition_option_label
USING btree
(type_id, value COLLATE pg_catalog."default");

CREATE UNIQUE INDEX index_condition_string_value_on_condition_id
ON condition_string_value
USING btree
(condition_id);
CREATE INDEX index_condition_string_value_on_value_and_condition_id
ON condition_string_value
USING btree
(value COLLATE pg_catalog."default", condition_id);

Please let me know of any suggestions on how to try to get similar
results in PG 9.2 as well as to understand why fast.sql performs so much
better than slow.sql on PG 9.1.

Best,
Rodrigo.

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Merlin Moncure 2012-11-06 17:22:12 Re: Query completed in < 1s in PG 9.1 and ~ 700s in PG 9.2
Previous Message Denis 2012-11-06 15:40:21 Re: [HACKERS] pg_dump and thousands of schemas