Re: 8.1 -- very slow query time because of "BETWEEN" (dbmail)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Brian Neu <proclivity76(at)yahoo(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: 8.1 -- very slow query time because of "BETWEEN" (dbmail)
Date: 2007-06-22 03:33:17
Message-ID: 11548.1182483197@sss.pgh.pa.us
Lists: pgsql-novice

Brian Neu <proclivity76(at)yahoo(dot)com> writes:
> Ahhh. I knew troubleshooting this would lead to cool new discoveries and troubleshooting tools. I apologize if Yahoo jacks the formatting up:

It's still readable ... seems the core of the problem is here:

> " -> Bitmap Heap Scan on dbmail_headervalue v (cost=84.09..15910.01 rows=7454 width=48) (actual time=13.653..13.678 rows=17 loops=1)"
> " Recheck Cond: (v.physmessage_id = "outer".physmessage_id)"
> " -> Bitmap Index Scan on dbmail_headervalue_1 (cost=0.00..84.09 rows=7454 width=0) (actual time=13.589..13.589 rows=17 loops=1)"
> " Index Cond: (v.physmessage_id = "outer".physmessage_id)"

In the slow case, the planner estimates that it would have to do this scan
3 times rather than once, when once is correct. (This is because range
estimation is a bit fuzzier than equality estimation. Estimating 3
matching rows instead of 1 is still well within reasonable error,
though.) The real problem is that it's estimating 7454 matching
dbmail_headervalue rows per outer row, when the truth is only 17. That
results in a large overestimate of the cost of doing this scan, which
convinces the planner that it doesn't want to do it more than once.
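
To put rough numbers on that, here is a small arithmetic sketch using only the figures from the plan fragment quoted above. It assumes, as a simplification, that the cost of repeating the inner scan is just the per-scan cost estimate times the estimated number of outer rows; the planner's real cost model has more terms than this.

```sql
-- Figures taken from the quoted plan: estimated rows 7454, actual rows 17,
-- estimated total cost of one bitmap heap scan 15910.01, estimated 3 outer rows.
SELECT round(7454 / 17.0, 1) AS row_overestimate_factor,   -- estimated rows / actual rows
       3 * 15910.01          AS est_cost_of_three_scans;   -- simplified: loops * per-scan cost
```

The row estimate is off by a factor of roughly 438, and under this simplified view three repetitions of the scan look like a cost of about 47730, even though each real scan took under 14 ms.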

So basically the trick here is to get that 7454 number closer to
reality. Has this table been ANALYZEd lately? If so, could we
see the pg_stats entry for the physmessage_id column?
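
A minimal sketch of those two checks, assuming dbmail_headervalue is reachable on the default search path. ANALYZE and the pg_stats view are standard PostgreSQL; the column list is only a selection of the fields most likely to matter for row estimates.

```sql
-- Refresh the planner statistics for the table in question.
ANALYZE dbmail_headervalue;

-- Show what the planner currently believes about the physmessage_id column.
SELECT null_frac,
       n_distinct,
       most_common_vals,
       most_common_freqs,
       histogram_bounds
FROM pg_stats
WHERE tablename = 'dbmail_headervalue'
  AND attname = 'physmessage_id';
```

If n_distinct turns out to be far lower than the true number of distinct physmessage_id values, that would be consistent with an estimate as high as 7454 rows per key.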

regards, tom lane
