Query running a lot faster with enable_nestloop=false

From: Mohanaraj Gopala Krishnan <mohangk(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Query running a lot faster with enable_nestloop=false
Date: 2011-10-25 09:09:42
Message-ID: CAK6uQa9wNKmsukj6-StadxJ1HN8sFV6OOZ-q8R5fnmtBDjVhPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi guys,

I have a query that runs a lot slower (~5 minutes) when I run it with
the default enable_nestloop=true and enable_nestloop=false (~10 secs).
The actual query is available here http://pastie.org/2754424 . It is a
reporting query with many joins as the database is mainly used for
transaction processing.

Explain analyse result for both cases:

Machine A nestloop=true - http://explain.depesz.com/s/nkj0 (~5 minutes)
Machine A nestloop=false - http://explain.depesz.com/s/wBM (~10 secs)

On a different slightly slower machine (Machine B), copying the
database over and leaving the default enable_nestloop=true it takes
~20 secs.

Machine B nestloop=true - http://explain.depesz.com/s/dYO (~ 20secs)

For all the cases above I ensured that I did an ANALYZE before running
the queries. There were no other queries running in parallel.
Both machines are running PostgreSQL 8.4.6. Machine B is using the
default configuration provided by the package while for Machine A we
applied the changes suggested by pgtune - http://pastie.org/2755113.

Machine A is running Ubuntu 10.04 32 bit while Machine B is running
Ubuntu 8.04 32 bit.

Machine A spec -
Intel(R) Xeon(R) CPU X3450 @ 2.67GHz (8 Cores)
8GB RAM (2 x 4GB)
4 x 300GB 15k SAS

Machine B spec -
Intel(R) Pentium(R) D CPU 2.80GHz x 2
2GB RAM
1 x 80GB SATA HDD

1. For Machine A, what can I do to make the planner choose the faster
plan without setting enable_nestloop=false ?

2. From the research I have done it seems to be that the reason the
planner is choosing the unoptimal query is because of the huge
difference between the estimated and actual rows. How can I get this
figure closer ?

3. If I should rewrite the query, what should I change ?

4. Why is it that the planner seems to be doing the right thing for
Machine B without setting enable_nestloop=false. What should I be
comparing in both the machines to understand the difference in choice
that the planner made ?

I have tried reading through the manual section "55.1. Row Estimation
Examples", "14.2. Statistics Used by the Planner". I am still trying
to fully apply the information to my specific case above and hence any
help or pointers would be greatly appreciated.

In a last ditch effort we also tried upgrading Machine A to
PostgresSQL 9.1 and that did not rectify the issue. We have reverted
the upgrade for now.

Thank you for your time.

--
Mohan

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Robins Tharakan 2011-10-25 09:41:27 Re: Query running a lot faster with enable_nestloop=false
Previous Message Thilo Raufeisen 2011-10-25 07:48:20 Re: Choosing between Intel 320, Intel 510 or OCZ Vertex 3 SSD for db server