Re: optimizing large query with IN (...)

From: Steve Atkins <steve(at)blighty(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: optimizing large query with IN (...)
Date: 2004-03-10 14:42:54
Message-ID: 20040310144253.GA31063@gp.word-to-the-wise.com
Lists: pgsql-performance

On Wed, Mar 10, 2004 at 12:35:15AM -0300, Marcus Andree S. Magalhaes wrote:
> Guys,
>
> I got a Java program to tune. It connects to a 7.4.1 postgresql server
> running Linux using JDBC.
>
> The program needs to update a counter on a somewhat large number of
> rows, about 1200 on a ~130k rows table. The query is something like
> the following:
>
> UPDATE table SET table.par = table.par + 1
> WHERE table.key IN ('value1', 'value2', ... , 'value1200' )
>
> This query runs on a transaction (by issuing a call to
> setAutoCommit(false)) and a commit() right after the query
> is sent to the backend.
>
> The process of committing and updating the values is painfully slow
> (no surprises here). Any ideas?

I posted an analysis of use of IN () like this a few weeks ago on
pgsql-general.

The approach you're using is optimal for < 3 values.

For any more than that, insert value1 ... value1200 into a temporary
table, then do

UPDATE table SET par = par + 1
WHERE table.key IN (SELECT value FROM temp_table);

Indexing the temporary table increases the speed only marginally, not
significantly.
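As a minimal sketch of the temp-table approach (using Python's sqlite3
in place of PostgreSQL/JDBC just so it runs standalone; the table and
column names here are made up, not from the original post):

```python
import sqlite3

# Stand-in database: a counters table with 2000 rows, all at zero.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE counters (key TEXT PRIMARY KEY, par INTEGER)")
cur.executemany("INSERT INTO counters VALUES (?, ?)",
                [(f"value{i}", 0) for i in range(1, 2001)])

# The ~1200 keys whose counters need bumping.
keys = [f"value{i}" for i in range(1, 1201)]

# One transaction: load the keys into a temp table, then issue a single
# UPDATE with IN (SELECT ...) instead of a 1200-element literal IN list.
cur.execute("CREATE TEMP TABLE temp_keys (value TEXT)")
cur.executemany("INSERT INTO temp_keys VALUES (?)", [(k,) for k in keys])
cur.execute("UPDATE counters SET par = par + 1 "
            "WHERE key IN (SELECT value FROM temp_keys)")
conn.commit()

print(cur.execute("SELECT SUM(par) FROM counters").fetchone()[0])  # 1200
```

In JDBC the same shape applies: populate the temp table with a batched
PreparedStatement inside the transaction, then run the one UPDATE and
commit.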

Cheers,
Steve
