| From: | "scott.marlowe" <scott.marlowe@ihs.com> |
|---|---|
| To: | "Marcus Andree S. Magalhaes" <marcus.magalhaes@vlinfo.com.br> |
| Cc: | <pgsql-performance@postgresql.org> |
| Subject: | Re: optimizing large query with IN (...) |
| Date: | 2004-03-10 16:47:09 |
| Message-ID: | Pine.LNX.4.33.0403100944240.10418-100000@css120.ihs.com |
| Lists: | pgsql-performance |
On Wed, 10 Mar 2004, Marcus Andree S. Magalhaes wrote:
>
> Guys,
>
> I got a Java program to tune. It connects to a 7.4.1 postgresql server
> running Linux using JDBC.
>
> The program needs to update a counter on a somewhat large number of
> rows, about 1200 on a ~130k rows table. The query is something like
> the following:
>
> UPDATE table SET table.par = table.par + 1
> WHERE table.key IN ('value1', 'value2', ... , 'value1200' )
>
> This query runs on a transaction (by issuing a call to
> setAutoCommit(false)) and a commit() right after the query
> is sent to the backend.
>
> The process of committing and updating the values is painfully slow
> (no surprises here). Any ideas?
The key, as I understand it, is that 7.4 introduced massive improvements
in handling moderately large IN() clauses, but only when the values are
provided by a subselect and the result fits in sort_mem. A literal list of
1200 values doesn't get the same treatment.

So, creating a temp table with all the values in it and using IN() on the
temp table may be a win:
begin;
create temp table t_ids(id int);
insert into t_ids(id) values (123); -- repeat once per value
select * from maintable where id in (select id from t_ids);
...
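On the JDBC side, the temp-table approach might look like the sketch below. This is a minimal illustration, not code from the thread: the table and column names (`maintable`, `par`, `key`), the `t_ids` temp table, and the text key type are assumptions based on the query Marcus posted.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

public class CounterBatchUpdate {

    // Builds the UPDATE that consumes the temp table. Table and column
    // names here are placeholders, not confirmed by the original thread.
    static String updateSql(String table, String counterCol, String keyCol) {
        return "UPDATE " + table + " SET " + counterCol + " = " + counterCol
                + " + 1 WHERE " + keyCol + " IN (SELECT id FROM t_ids)";
    }

    // One transaction: create the temp table, batch-insert the keys,
    // then run a single UPDATE whose IN (...) is a subselect.
    static int bumpCounters(Connection conn, List<String> keys)
            throws SQLException {
        conn.setAutoCommit(false);
        try (Statement st = conn.createStatement()) {
            // ON COMMIT DROP (available since 7.3) cleans up automatically
            st.execute("CREATE TEMP TABLE t_ids(id text) ON COMMIT DROP");
        }
        try (PreparedStatement ps =
                conn.prepareStatement("INSERT INTO t_ids(id) VALUES (?)")) {
            for (String k : keys) {
                ps.setString(1, k);
                ps.addBatch();   // queue locally instead of 1200 round trips
            }
            ps.executeBatch();   // send the whole batch at once
        }
        int updated;
        try (Statement st = conn.createStatement()) {
            updated = st.executeUpdate(updateSql("maintable", "par", "key"));
        }
        conn.commit();
        return updated;
    }
}
```

Batching the inserts keeps the network round trips down, and the subselect form lets 7.4's hashed-IN plan kick in for the final UPDATE.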
| | From | Date | Subject |
|---|---|---|---|
| Next Message | Tom Lane | 2004-03-10 16:53:51 | Re: syslog slowing the database? |
| Previous Message | Andrew Sullivan | 2004-03-10 16:07:28 | Re: compiling 7.4.1 on Solaris 9 |