Re: Order by (for 15 rows) adds 30 seconds to query time

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-performance(at)postgresql(dot)org, matthew(at)flymine(dot)org, jmpoure(at)free(dot)fr, craig(at)postnewspapers(dot)com(dot)au
Subject: Re: Order by (for 15 rows) adds 30 seconds to query time
Date: 2009-12-27 21:52:31
Message-ID: 17623.1261950751@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

"Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> writes:
> Tom Lane wrote:
>> That does look weird. Do we have a self-contained test case?

> I've been tinkering with this and I now have a self-contained test
> case (SQL statements and run results attached). I've debugged through
> it and things don't seem right in set_append_rel_pathlist, since
> childrel->rows seems to contain the total rows in each table rather
> than the number which meet the join conditions.

Yeah, that is expected. Nestloop inner indexscans have a rowcount
estimate that is different from that of the parent table --- the
parent's rowcount is what would be applicable for another type of
join, such as merge or hash, where the join condition is applied at
the join node not in the relation scan.

The problem here boils down to the fact that examine_variable punts on
appendrel variables:

else if (rte->inh)
{
/*
* XXX This means the Var represents a column of an append
* relation. Later add code to look at the member relations and
* try to derive some kind of combined statistics?
*/
}

This means you get a default estimate for the selectivity of the join
condition, so the joinrel size estimate ends up being 0.005 * 1 * 40000.
That's set long before we ever generate indexscan plans, and I don't
think there's any clean way to correct the size estimate when we do.

Fixing this has been on the to-do list since forever. I don't think
we'll make much progress on it until we have an explicit notion of
partitioned tables. The approach contemplated in the comment, of
assembling some stats on-the-fly from the stats for individual child
tables, doesn't seem real practical from a planning-time standpoint.
The thing that you really want to know here is that there will be only
one matching id value in the whole partitioned table; and that would be
something trivial to know if we understood about partitioning keys,
but it's difficult to extract from independent sets of stats.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Kevin Grittner 2009-12-28 15:09:26 Re: Order by (for 15 rows) adds 30 seconds to query time
Previous Message Glyn Astill 2009-12-27 15:36:00 Re: SATA drives performance