Re: Query too slow with "not in" condition

From: "David Rowley" <dgrowley(at)gmail(dot)com>
To: "'????????? ??'" <masivakumar(at)gmail(dot)com>, <pgsql-general(at)postgresql(dot)org>
Subject: Re: Query too slow with "not in" condition
Date: 2008-11-30 09:58:49
Message-ID: 185DAC59E2914010B0E0A54A71638A68@amd64
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> I have loaded the backup from a live database in a test system. Both run
> 8.3.5 versions. The plan for a query varies in these systems.

> Test System
> A. PostgreSQL 8.3.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2
> 20061115 (prerelease) (SUSE Linux)

> B. explain select * from stock_transaction_detail_106 where transaction_id
> not in (select transaction_id from transaction_value);
> Seq Scan on stock_transaction_detail_106 (cost=1829.78..2867.74
rows=16478 width=128)
> Filter: (NOT (hashed subplan))
> SubPlan
> -> Seq Scan on transaction_value (cost=0.00..1598.02 rows=92702
width=4)

> The query takes about 300 ms to run.

> Production System

> 1. PostgreSQL 8.3.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.2.1
(SUSE Linux)

> 2. explain select * from stock_transaction_detail_106 where transaction_id
not in (select transaction_id from transaction_value);
> Seq Scan on stock_transaction_detail_106 (cost=2153.95..25245478.39
rows=17064 width=122)
> Filter: (NOT (subplan))
> SubPlan
> -> Materialize (cost=2153.95..3401.01 rows=92905 width=4)
> -> Seq Scan on transaction_value (cost=0.00..1743.05
rows=92905 width=4)

> Here the query did not return any results after 1hour.

> In both the computers same query with in condition runs fast (520 ms and
290 ms respectively)

> Please help me to resolve this issue. (One configuration difference
> between these machines are pg_hba.conf file. In production machine it is
> password enabled. In test machine it is trust mode.)

You might find this page interesting:
http://www.depesz.com/index.php/2008/08/13/nulls-vs-not-in/

I assume workmem, effective_cache_size and random_page_cost are all the same
in the 2 postgresql.conf?

Do you get the same after you ANALYZE the database?

If you read the page above you might realise NOT IN is not really what you
want. Maybe NOT EXISTS or a LEFT OUTER JOIN ... WHERE transaction_id IS NULL
?

David.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Andrus 2008-11-30 10:17:58 Re: db backup script in gentoo
Previous Message சிவகுமார் மா 2008-11-30 09:19:18 Query too slow with "not in" condition