PG 9.1 performance loss due to query plan being changed depending on db data (4s vs 200ms)

From: Rodrigo Rosenfeld Rosas <rr(dot)rosas(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: PG 9.1 performance loss due to query plan being changed depending on db data (4s vs 200ms)
Date: 2012-12-19 19:35:57
Message-ID: 50D2171D.9010608@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Trying again since I couldn't post without being subscribed. The message
got stalled and was never sent, so I just subscribed and I'm trying
again. Original message I tried to send two days ago:

----------------------------------

I've explained a bit of how my application works in this thread as well
as the reasons why I couldn't use PG 9.2.1 due to performance issues and
had to rollback to 9.1.

http://postgresql.1045698.n5.nabble.com/Query-completed-in-lt-1s-in-PG-9-1-and-700s-in-PG-9-2-td5730899.html

Now I found that 9.2.2 was released while 9.1 is performing worse for
the same db schema, but the data is now different.

So, here are the output of some explain analyze:

Old DB dump, PG 9.1: http://explain.depesz.com/s/mvf (0.2s)
New DB dump, PG 9.1: http://explain.depesz.com/s/vT2k (4.3s)
New DB dump, PG 9.2.2: http://explain.depesz.com/s/uu0 (0.04s)

I've already upgraded back to PG 9.2.2 but I thought you might be
interested on backporting that improvement to 9.1 as well and I'm not
even sure if the bug above was fixed intentionally or by chance so I'd
like to be sure about that...

The query I used was:

SELECT t.id as tid,
t.acquiror_company_name || ' / ' || t.target_company_name as tname,
exists(select id from condition_document_excerpt where
condition_id=c1726.id) as v1726_has_reference,
l1726.value as v1726
FROM company_transaction t
left outer join
condition_option_value v1726
inner join transaction_condition c1726
on c1726.id=v1726.condition_id and type_id=1726
inner join condition_option_label l1726
on l1726.id=v1726.value_id
on c1726.transaction_id = t.id
WHERE t.edit_status = 'Finished' and
(t.id in (select transaction_id from
condition_option_value v1726
inner join transaction_condition c1726
on c1726.id=v1726.condition_id and type_id=1726
inner join condition_option_label l1726
on l1726.id=v1726.value_id
AND (v1726.value_id = 278)
)
)
ORDER BY
t.acquiror_company_name, t.target_company_name

If I simplify the WHERE condition it performs much better in 9.1 for
this particular case (but I can't do that as the queries are generated
dynamically, please see first mentioned link to understand the reason):

WHERE t.edit_status = 'Finished' and v1726.value_id = 278

New DB dump, 9.1, simplified query: http://explain.depesz.com/s/oj1 (0.03s)

The inner query (for the "in" clause) alone takes 44ms:

select transaction_id from
condition_option_value v1726
inner join transaction_condition c1726
on c1726.id=v1726.condition_id and type_id=1726
inner join condition_option_label l1726
on l1726.id=v1726.value_id
AND (v1726.value_id = 278)

So, what would be the reason for the full original query to take over 4s
in PG 9.1?

Best,

Rodrigo.

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Neill 2012-12-19 21:13:06 Why does the query planner use two full indexes, when a dedicated partial index exists?
Previous Message Igor Neyman 2012-12-19 19:13:42 Re: How can i find out top high load sql queries in PostgreSQL.