From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Josh Berkus <josh(at)agliodbs(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: 7k records into Sort node, 4.5m out? |
Date: | 2012-08-16 06:01:51 |
Message-ID: | 2703.1345096911@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Josh Berkus <josh(at)agliodbs(dot)com> writes:
> SELECT COUNT(*)
> FROM "user"
> INNER JOIN "house"
> ON ("user"."house_id" = "house"."id")
> LEFT OUTER JOIN "district"
> ON ("house"."district_id" = "district"."id")
> WHERE ("user"."status" = 0
> AND ("district"."update_status" = 2
> OR "district"."update_status" = 3 )
> AND ("user"."valid" = 1
> OR "user"."valid" = 3 )
> AND "district"."is_test" = false );
> However, since the anonymization above doesn't quite match that used in
> the EXPLAIN plan, I'm not sure what you'll get out of it. And yes, we
> know that the outer join is being invalidated.
Ah, I see where I was confused: in the original query plan I'd been
imagining that charlie.sierra was a unique column, but your gloss on
that as being house.district_id implies that it's highly non-unique.
And looking at the rowcounts in the original plan backs that up:
there are about 600 house rows per district row. So my thought of
having district as the outer side of a nestloop scanning the index
on house.district_id would not really work very well --- maybe it
would end up cheaper than the mergejoin plan, but it's far from a
clear-cut win.
On the whole I'm thinking the code is operating as designed here.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2012-08-16 14:53:21 | Re: High Disk write and space taken by PostgreSQL |
Previous Message | Claudio Freire | 2012-08-16 05:50:46 | Re: High Disk write and space taken by PostgreSQL |