Re: A problem with the IN clause

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Sean Shanny <shannyconsulting(at)earthlink(dot)net>
Cc: pgsql-general(at)postgresql(dot)org, Nick Shanny <nshanny(at)tripadvisor(dot)com>
Subject: Re: A problem with the IN clause
Date: 2004-05-19 18:09:48
Message-ID: 18900.1084990188@sss.pgh.pa.us
Lists: pgsql-general

Sean Shanny <shannyconsulting(at)earthlink(dot)net> writes:
> When I run this against our warehouse instance I get an out of memory
> error. If I remove the
> AND t1.newsletterid_key IN (SELECT newsletterid FROM t_newscontentstatic)
> portion it runs fine.

I think the problem is not there at all, but with drastic
underestimation of the number of rows coming from f_pageviews:

>   ->  Seq Scan on f_pageviews t1  (cost=0.00..585486.72 rows=1 width=24)
>         (actual time=60502.415..-463715.543 rows=24422838 loops=1)
>         Filter: ((date_key >= 496) AND (date_key <= 502))

The plan you say is failing is trying to load this result into a
hashtable ... and since it's only expecting 1 row, it's not going
to try to partition the hashtable or anything like that.

Are your ANALYZE stats for f_pageviews up to date? Perhaps you need to
increase the stats target for date_key to get more resolution in the
stats.
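
As a rough sketch of what that check might look like (the table and column
names come from the plan above; the target value of 100 is only an
illustrative assumption, not a recommendation for this data set):

```sql
-- Refresh the planner statistics for the table.
ANALYZE f_pageviews;

-- Inspect what the planner currently knows about date_key.
SELECT n_distinct, most_common_vals, histogram_bounds
  FROM pg_stats
 WHERE tablename = 'f_pageviews' AND attname = 'date_key';

-- Raise the per-column statistics target (100 is an assumed example value),
-- then re-analyze so the larger sample actually gets collected.
ALTER TABLE f_pageviews ALTER COLUMN date_key SET STATISTICS 100;
ANALYZE f_pageviews;

-- See whether the row estimate for the range filter has moved away from rows=1.
EXPLAIN SELECT * FROM f_pageviews WHERE date_key >= 496 AND date_key <= 502;
```

If the estimate in the last EXPLAIN is still far below the actual row count,
a higher target may be worth trying.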

regards, tom lane
