Re: Bad plan by Planner (Already resolved?)

From: Robins Tharakan <robins(dot)tharakan(at)comodo(dot)com>
To: PGSQL Performance List <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Bad plan by Planner (Already resolved?)
Date: 2011-10-18 05:57:57
Message-ID: 4E9D1565.3020403@comodo.com
Lists: pgsql-performance

Hi,

I'll try to answer in-line.

On 10/17/2011 09:32 PM, Kevin Grittner wrote:
> First off, did you use pg_upgrade from an earlier major release? If
> so, be sure you've dealt with this issue:
Although I joined here only recently, I doubt pg_upgrade was used. It doesn't
look like that issue either: there is no data loss, and this appears to be
primarily a planner-specific problem.

> the description is a
> little vague without table definitions and EXPLAIN ANALYZE output,
> so people might just not be sure.
Makes sense. I just didn't want to send a large mail in case this turned out
to be a well-known issue. Please find the plans below (EXPLAIN for the
existing query, EXPLAIN ANALYSE for the workaround). I've changed the table
and field names, and provided other details as well.

large_table_a: ~20 million rows
n_dead_tuples / reltuples: ~7%
last analysed: < 2 weeks ago

large_table_b: ~140 million rows
n_dead_tuples / reltuples: ~0%
last analysed: < 2 days ago

default_statistics_target: 1000

field_a: int (indexed)
field_b: int (indexed)
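
(In case it helps: the figures above were pulled from the statistics views,
roughly with a query like the one below. This is only a sketch against the
standard pg_class / pg_stat_user_tables catalogs; the IN-list of table names
is just for illustration.)

-- Approximate row count, dead-tuple ratio and last analyse time (sketch)
SELECT s.relname,
       c.reltuples AS approx_rows,
       s.n_dead_tup,
       round((100 * s.n_dead_tup / nullif(c.reltuples, 0))::numeric, 1) AS dead_pct,
       greatest(s.last_analyze, s.last_autoanalyze) AS last_analysed
FROM pg_stat_user_tables s
JOIN pg_class c ON c.oid = s.relid
WHERE s.relname IN ('large_table_a', 'large_table_b');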

> Since it's arguably in your best
> interest to update at least to 8.4.9 anyway, the easiest way to get
> your answer might be to do so and test it.
Frankly, it's not easy to just try out versions. The database is over 1 TB,
and getting that kind of resources just to test one query isn't simple; I
hope you understand. I have the workaround implemented, but I wanted to be
sure this is addressed in a newer version.

===============
EXISTING QUERY:
SELECT field_a
FROM large_table_a
JOIN large_table_b USING (field_a)
WHERE field_b IN (SELECT large_table_b.field_b
                  FROM large_table_b
                  WHERE field_a = 2673056);

EXPLAIN (cost estimates only):
Hash Join  (cost=273247.23..6460088.89 rows=142564896 width=4)
  Hash Cond: (public.large_table_b.field_b = public.large_table_b.field_b)
  ->  Merge Join  (cost=273112.62..5925331.24 rows=142564896 width=8)
        Merge Cond: (large_table_a.field_a = public.large_table_b.field_a)
        ->  Index Scan using "PK_large_table_a" on large_table_a  (cost=0.00..570804.30 rows=22935395 width=4)
        ->  Index Scan using "IX_large_table_b_field_a" on large_table_b  (cost=0.00..4381499.54 rows=142564896 width=8)
  ->  Hash  (cost=133.32..133.32 rows=103 width=4)
        ->  HashAggregate  (cost=132.29..133.32 rows=103 width=4)
              ->  Index Scan using "IX_large_table_b_field_a" on large_table_b  (cost=0.00..131.87 rows=165 width=4)
                    Index Cond: (field_a = 2673056)
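
(For completeness: the same filter can also be written with EXISTS instead of
IN. I haven't checked whether the planner treats it any better here; just
noting it as an equivalent form one could test.)

-- Equivalent EXISTS form of the existing query (untested here)
SELECT field_a
FROM large_table_a
JOIN large_table_b USING (field_a)
WHERE EXISTS (SELECT 1
              FROM large_table_b b2
              WHERE b2.field_b = large_table_b.field_b
                AND b2.field_a = 2673056);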

=====================

ALTERNATE QUERY:
SELECT s1.field_a FROM large_table_a
JOIN large_table_b s1 USING (field_a)
JOIN large_table_b s2 ON s1.field_b = s2.field_b
WHERE s2.field_a = 2673056

EXPLAIN ANALYSE:
Nested Loop  (cost=0.00..2368.74 rows=469 width=4) (actual time=0.090..0.549 rows=6 loops=1)
  ->  Nested Loop  (cost=0.00..1784.06 rows=469 width=4) (actual time=0.057..0.350 rows=16 loops=1)
        ->  Index Scan using "IX_large_table_b_field_a" on large_table_b s2  (cost=0.00..131.87 rows=165 width=4) (actual time=0.033..0.046 rows=6 loops=1)
              Index Cond: (field_a = 2673056)
        ->  Index Scan using "IX_large_table_b_SampleId" on large_table_b s1  (cost=0.00..9.99 rows=2 width=8) (actual time=0.037..0.047 rows=3 loops=6)
              Index Cond: (s1.field_b = s2.field_b)
  ->  Index Scan using "PK_large_table_a" on large_table_a  (cost=0.00..1.23 rows=1 width=4) (actual time=0.011..0.011 rows=0 loops=16)
        Index Cond: (large_table_a.field_a = s1.field_a)
Total runtime: 0.620 ms

--
Robins Tharakan
