Quick Links

Performance and IN clauses

From:	"Kynn Jones" <kynnjo(at)gmail(dot)com>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Performance and IN clauses
Date:	2008-11-18 15:53:19
Message-ID:	c2350ba40811180753m2c6b698csdf180047e0fa621f@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

Hi. I have a Perl script whose main loop generates thousands of SQL updates
of the form
UPDATE edge SET keep = true WHERE node1 IN ( $node_list ) AND node2 =
$node_id;

...where here $node_list stands for a comma-separated list of integers, and
$node_id stands for some integer.

The list represented by $node_list can be fairly long (on average it has
around 900 entries, and can be as long as 30K entries), and I'm concerned
about the performance cost of testing for inclusion in such a long list. Is
this done by a sequential search? If so, is there a better way to write
this query? (FWIW, I have two indexes on the edge table using btree( node1
) and btree( node2 ), respectively.)

Also, assuming that the optimal way to write the query depends on the length
of $node_list, how can I estimate the "critical length" at which I should
switch from one form of the query to the other?

TIA!

Kynn

Responses

Re: Performance and IN clauses at 2008-11-18 16:12:24 from Matthew Wakeling

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Matthew Wakeling	2008-11-18 16:12:24	Re: Performance and IN clauses
Previous Message	Dave Page	2008-11-17 20:47:48	Re: Bad performance on simple query