Re: 7k records into Sort node, 4.5m out?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: 7k records into Sort node, 4.5m out?
Date: 2012-08-16 06:01:51
Message-ID: 2703.1345096911@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Josh Berkus <josh(at)agliodbs(dot)com> writes:
> SELECT COUNT(*)
> FROM "user"
> INNER JOIN "house"
> ON ("user"."house_id" = "house"."id")
> LEFT OUTER JOIN "district"
> ON ("house"."district_id" = "district"."id")
> WHERE ("user"."status" = 0
> AND ("district"."update_status" = 2
> OR "district"."update_status" = 3 )
> AND ("user"."valid" = 1
> OR "user"."valid" = 3 )
> AND "district"."is_test" = false );

> However, since the anonymization above doesn't quite match that used in
> the EXPLAIN plan, I'm not sure what you'll get out of it. And yes, we
> know that the outer join is being invalidated.

Ah, I see where I was confused: in the original query plan I'd been
imagining that charlie.sierra was a unique column, but your gloss on
that as being house.district_id implies that it's highly non-unique.
And looking at the rowcounts in the original plan backs that up:
there are about 600 house rows per district row. So my thought of
having district as the outer side of a nestloop scanning the index
on house.district_id would not really work very well --- maybe it
would end up cheaper than the mergejoin plan, but it's far from a
clear-cut win.

On the whole I'm thinking the code is operating as designed here.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Bruce Momjian 2012-08-16 14:53:21 Re: High Disk write and space taken by PostgreSQL
Previous Message Claudio Freire 2012-08-16 05:50:46 Re: High Disk write and space taken by PostgreSQL