From: | "Dawid Kuroczko" <qnex42(at)gmail(dot)com> |
---|---|
To: | PFC <lists(at)peufeu(dot)com> |
Cc: | "Christian Kratzer" <ck(at)cksoft(dot)de>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-performance(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: [HACKERS] Big IN() clauses etc : feature proposal |
Date: | 2006-05-09 16:43:23 |
Message-ID: | 758d5e7f0605090943p1bbfdae2p4cb28ba128288316@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
On 5/9/06, PFC <lists(at)peufeu(dot)com> wrote:
> > You might consider just selecting your primary key or a set of
> > primary keys to involved relations in your search query. If you
> > currently use "select *" this can make your result set very large.
> >
> > Copying all the result set to the temp. costs you additional IO
> > that you propably dont need.
>
> It is a bit of a catch : I need this information, because the purpose of
> the query is to retrieve these objects. I can first store the ids, then
> retrieve the objects, but it's one more query.
>
> > Also you might try:
> > SELECT * FROM somewhere JOIN result USING (id)
> > Instead of:
> > SELECT * FROM somewhere WHERE id IN (SELECT id FROM result)
>
> Yes you're right in this case ; however the query to retrieve the owners
> needs to eliminate duplicates, which IN() does.
Well, you can either
SELECT * FROM somewhere JOIN (SELECT id FROM result GROUP BY id) AS
a USING (id);
or even, for large number of ids:
CREATE TEMPORARY TABLE result_ids AS SELECT id FROM RESULT GROUP BY id;
SELECT * FROM somewhere JOIN result_ids USING (id);
> > On the other hand if your search query runs in 10ms it seems to be fast
> > enough for you to run it multiple times. Theres propably no point in
> > optimizing anything in such case.
>
> I don't think so :
> - 10 ms is a mean time, sometimes it can take much more time, sometimes
> it's faster.
> - Repeating the query might yield different results if records were added
> or deleted in the meantime.
You may SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
though locking might bite you. :)
> - Complex search queries have imprecise rowcount estimates ; hence the
> joins that I would add to them will get suboptimal plans.
>
> Using a temp table is really the cleanest solution now ; but it's too
> slow so I reverted to generating big IN() clauses in the application.
A thought, haven't checked it though, but...
You might want to use PL to store values, say PLperl, or even C, say:
create or replace function perl_store(name text, val int) returns void
as $$ my $name = shift; push @{$foo{$name}}, shift; return $$ LANGUAGE
plperl;
select perl_store('someids', id) from something group by id;
(you may need to warp it inside count())
Then use it:
create or replace function perl_retr(name text) returns setof int as
$$ my $name = shift; return $foo{$name} $$ LANGUAGE plperl;
select * from someother join perl_retr('someids') AS a(id) using (id);
All is in the memory. Of course, you need to do some cleanup, test it,
etc, etc, etc. :)
Should work faster than a in-application solution :)
Regards,
Dawid
From | Date | Subject | |
---|---|---|---|
Next Message | PFC | 2006-05-09 16:49:22 | Re: [HACKERS] Big IN() clauses etc : feature proposal |
Previous Message | PFC | 2006-05-09 16:38:51 | Re: [HACKERS] Big IN() clauses etc : feature proposal |
From | Date | Subject | |
---|---|---|---|
Next Message | PFC | 2006-05-09 16:49:22 | Re: [HACKERS] Big IN() clauses etc : feature proposal |
Previous Message | PFC | 2006-05-09 16:38:51 | Re: [HACKERS] Big IN() clauses etc : feature proposal |