Skip site navigation (1) Skip section navigation (2)

Re: Simple machine-killing query!

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: pgsql-performance(at)postgresql(dot)org,Victor Ciurus <vikcious(at)gmail(dot)com>
Subject: Re: Simple machine-killing query!
Date: 2004-10-21 17:14:39
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-performance

> [explain] select * from BIGMA where string not in (select * from DIRTY);
>                                QUERY PLAN
> ------------------------------------------------------------------------
>  Seq Scan on bigma  (cost=0.00..24582291.25 rows=500 width=145)
>    Filter: (NOT (subplan))
>    SubPlan
>      ->  Seq Scan on dirty  (cost=0.00..42904.63 rows=2503963 width=82)

This is what you call an "evil query".   I'm not surprised it takes forever; 
you're telling the database "Compare every value in 2.7 million rows of text 
against 2.5 million rows of text and give me those that don't match."   There 
is simply no way, on ANY RDBMS, for this query to execute and not eat all of 
your RAM and CPU for a long time.

You're simply going to have to allocate shared_buffers and sort_mem (about 2GB 
of sort_mem would be good) to the query, and turn the computer over to the 
task until it's done.   And, for the sake of sanity, when you find the 
200,000 rows that don't match, flag them so that you don't have to do this 

Josh Berkus
Aglio Database Solutions
San Francisco

In response to

pgsql-performance by date

Next:From: Sean ChittendenDate: 2004-10-21 18:40:08
Subject: Re: Anything to be gained from a 'Postgres Filesystem'?
Previous:From: Victor CiurusDate: 2004-10-21 17:03:41
Subject: Re: Simple machine-killing query!

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group