Quick Links

Re: [HACKERS] Big IN() clauses etc : feature proposal

From:	PFC <lists(at)peufeu(dot)com>
To:	"Christian Kratzer" <ck(at)cksoft(dot)de>
Cc:	pgsql-performance(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: [HACKERS] Big IN() clauses etc : feature proposal
Date:	2006-05-09 10:10:37
Message-ID:	op.s893vzcpcigqcu@apollo13
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers pgsql-performance

> Additionally to your query you are already transferring the whole result
> set multiple times. First you copy it to the result table. Then you
> read it again. Your subsequent queries will also have to read over
> all the unneeded tuples just to get your primary key.

Considering that the result set is not very large and will be cached in
RAM, this shouldn't be a problem.

> then why useth thy not the DISTINCT clause when building thy result
> table and thou shalt have no duplicates.

Because the result table contains no duplicates ;)
I need to remove duplicates in this type of queries :

-- get object owners info
SELECT * FROM users WHERE id IN (SELECT user_id FROM results);

And in this case I find IN() easier to read than DISTINCT (what I posted
was a simplification of my real use case...)

> which is a perfect reason to use a temp table. Another variation on the
> temp table scheme is use a result table and add a query_id.

True. Doesn't solve my problem though : it's still complex, doesn't have
good rowcount estimation, bloats a table (I only need these records for
the duration of the transaction), etc.

> We do something like this in our web application when users submit
> complex queries. For each query we store tuples of (query_id,result_id)
> in a result table. It's then easy for the web application to page the
> result set.

Yes, that is about the only sane way to page big result sets.

> A cleaner solution usually pays off in the long run whereas a hackish
> or overly complex solution will bite you in the behind for sure as
> time goes by.

Yes, but in this case temp tables add too much overhead. I wish there
were RAM based temp tables like in mysql. However I guess the current temp
table slowness comes from the need to mark their existence in the system
catalogs or something. That's why I proposed using cursors...

In response to

Re: Big IN() clauses etc : feature proposal at 2006-05-09 09:41:59 from Christian Kratzer

Responses

Re: [HACKERS] Big IN() clauses etc : feature proposal at 2006-05-09 10:36:32 from Martijn van Oosterhout

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Martijn van Oosterhout	2006-05-09 10:36:32	Re: [HACKERS] Big IN() clauses etc : feature proposal
Previous Message	Christian Kratzer	2006-05-09 09:41:59	Re: Big IN() clauses etc : feature proposal

Browse pgsql-performance by date

	From	Date	Subject
Next Message	blender	2006-05-09 10:19:08	PostgreSQL VACCUM killing CPU
Previous Message	Jean-Yves F. Barbier	2006-05-09 10:10:32	Re: Arguments Pro/Contra Software Raid