Re: optimizing large query with IN (...)

From: Steve Atkins <steve(at)blighty(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: optimizing large query with IN (...)
Date: 2004-03-10 14:42:54
Message-ID: 20040310144253.GA31063@gp.word-to-the-wise.com
Lists: pgsql-performance

On Wed, Mar 10, 2004 at 12:35:15AM -0300, Marcus Andree S. Magalhaes wrote:
> Guys,
>
> I got a Java program to tune. It connects to a 7.4.1 postgresql server
> running Linux using JDBC.
>
> The program needs to update a counter on a somewhat large number of
> rows, about 1200 on a ~130k rows table. The query is something like
> the following:
>
> UPDATE table SET table.par = table.par + 1
> WHERE table.key IN ('value1', 'value2', ... , 'value1200' )
>
> This query runs on a transaction (by issuing a call to
> setAutoCommit(false)) and a commit() right after the query
> is sent to the backend.
>
> The process of committing and updating the values is painfully slow
> (no surprises here). Any ideas?

I posted an analysis of use of IN () like this a few weeks ago on
pgsql-general.

The approach you're using is optimal for < 3 values.

For any more than that, insert value1 ... value1200 into a temporary
table, then do

UPDATE table SET par = par + 1
WHERE table.key IN (SELECT value FROM temp_table);

Indexing the temporary table increases the speed only marginally, not
significantly.
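As a minimal sketch of the temp-table approach (using Python's sqlite3
in place of PostgreSQL/JDBC just so it runs standalone; the table and
column names here are made up, not from the original post):

```python
import sqlite3

# Stand-in database: a counters table with 2000 rows, all at zero.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE counters (key TEXT PRIMARY KEY, par INTEGER)")
cur.executemany("INSERT INTO counters VALUES (?, ?)",
                [(f"value{i}", 0) for i in range(1, 2001)])

# The ~1200 keys whose counters need bumping.
keys = [f"value{i}" for i in range(1, 1201)]

# One transaction: load the keys into a temp table, then issue a single
# UPDATE with IN (SELECT ...) instead of a 1200-element literal IN list.
cur.execute("CREATE TEMP TABLE temp_keys (value TEXT)")
cur.executemany("INSERT INTO temp_keys VALUES (?)", [(k,) for k in keys])
cur.execute("UPDATE counters SET par = par + 1 "
            "WHERE key IN (SELECT value FROM temp_keys)")
conn.commit()

print(cur.execute("SELECT SUM(par) FROM counters").fetchone()[0])  # 1200
```

In JDBC the same shape applies: populate the temp table with a batched
PreparedStatement inside the transaction, then run the one UPDATE and
commit.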

Cheers,
Steve
