From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Thomas Kappler <tkappler(at)googlemail(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Slow query with self-join, group by, 100m rows |
Date: | 2011-09-21 15:02:42 |
Message-ID: | 23039.1316617362@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Thomas Kappler <tkappler(at)googlemail(dot)com> writes:
> The query we want to do is (with example values):
> select t.cid, count(distinct t1.subject_id)
> from termindex t1, termindex t2
> where t1.cid=20642 and t1.indextype=636 and t2.indextype=636 and
> t2.subject_id=t1.subject_id
> group by t2.cid;
The EXPLAIN output you provided doesn't appear to match this query (in
particular, I don't see the indextype restrictions being checked
anyplace in the plan).
One quick-and-dirty thing that might help is to raise work_mem enough
so that (1) you get a hash aggregation not a sort/group one, and (2)
if there are still any sorts being done, they don't spill to disk.
That will probably be a higher number than would be prudent to install
as a global setting, but you could SET it locally in the current
session before issuing the expensive query.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Smith | 2011-09-21 16:12:50 | Re: Show_database_bloat reliability? [was: Re: REINDEX not working for wastedspace] |
Previous Message | Shaun Thomas | 2011-09-21 14:55:38 | Re: REINDEX not working for wastedspace |