Re: query planner uses sequencial scan instead of index scan

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Daniel Ferreira <daniel(dot)ferreira(at)saba(dot)pt>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: query planner uses sequencial scan instead of index scan
Date: 2009-05-21 15:41:10
Message-ID: 11228.1242920470@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Daniel Ferreira <daniel(dot)ferreira(at)saba(dot)pt> writes:
> has we can see the query planner, decided to do sequencial scan in "a_mov_rcb" table and "recibos", when i set the flag "enable_seqscan" to false all goes well.

It's not really the seqscan that's the problem. The problem is this
rowcount misestimate:

> " -> Hash Join (cost=9016.52..9325.95 rows=1 width=108) (actual time=156.501..304.851 rows=12586 loops=1)"
> " Hash Cond: ((qtd.idasso = app.idasso) AND (qtd.idmov = rct.idmov) AND (qtd.mes = rct.mes))"

which is causing the planner to suppose that the remaining joins should
be done as nestloops. That would be the right thing if there really
were only one row... with twelve thousand of them, it's taking
twelve thousand times longer than the planner expected.

The right fix would be to get the estimate to be better. (Even if it
were 5 or 10 rows the planner would probably avoid the nestloops.)
But I'm not sure how much you can improve it by raising the stats
targets. This is trying to estimate the size of the join between
two GROUP BY subselects, and the planner is not tremendously good
at that.

A brute force solution might go like this:

1. Select the two GROUP BY sub-results into temp tables.

2. ANALYZE the temp tables.

3. Do the original query using the temp tables.

But it's a pain ...

regards, tom lane

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Joshua D. Drake 2009-05-21 15:58:12 Re: raid10 hard disk choice
Previous Message Robert Schnabel 2009-05-21 14:34:46 Re: raid10 hard disk choice