"henk de wit" <henk53602(at)hotmail(dot)com> writes:
>> --- the thing is evidently assuming
>> that only about 1 in 200 rows have status = '0', which might be accurate
>> as a global average but not for this particular merchant. What exactly
>> is the relationship between status and merchant_id, anyway?
> The meaning is that a "banners_link" belongs to a merchant with the id
> merchant_id. A "banners_link" can be disabled (status 1) or enabled (status
> 0). Globally about 1/3 of the banners_links have status 0 and 2/3 have
> status 1. The 1 in 200 estimate is indeed way off.
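[Editor's note: the claimed 1/3 vs. 2/3 split can be verified directly with a
count; "banners_links" is the table name as used in this thread.]

```sql
-- Actual global distribution of status values
SELECT status, count(*)
FROM banners_links
GROUP BY status;
```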
Well, that's darn odd. It should not be getting that so far wrong.
What's the datatype of the status column exactly (I'm guessing varchar
but maybe not)? Would you show us the pg_stats row for the status column?
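[Editor's note: the pg_stats row being requested can be pulled like this;
the table name "banners_links" is assumed from the thread.]

```sql
-- Planner statistics for the status column: the most_common_vals /
-- most_common_freqs pair shows what fraction the planner believes
-- each status value occupies.
SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'banners_links'
  AND attname = 'status';
```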
> One other interesting thing to note: if I remove the banners_links.status =
> 0 condition from the query altogether, the execution times improve
> dramatically again.
Right, because it was dead on about how many merchant_id = 5631 rows
there are. The estimation error is creeping in where it guesses how
selective the status filter is. It should be using the global fraction
of status = 0 rows for that, but it seems to be using a default estimate
instead (1/200 is actually the default eqsel estimate now that I think
about it). I'm not sure why, but I think it must have something to do
with the subquery structure of your query. Were you showing us the
whole truth about your query, or were there details you left out?
regards, tom lane