Re: BUG #5885: Strange rows estimation for left join

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #5885: Strange rows estimation for left join
Date: 2011-02-15 17:18:07
Message-ID: 23761.1297790287@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com> writes:
> Test case look like:

> create table "references" ( attr_id integer, reference integer,
> object_id integer );
> insert into "references" select *100**(random()),
> *100000**(random()^*10*), *1000000**(random()) from
> generate_series(*1*,*10000000*);
> create index xif01references on "references" ( reference, attr_id );
> create index xif02references on "references" ( object_id, attr_id, reference );

> analyze "references";

> explain select * from "references" rs left join "references" vm on
> vm.reference = rs.reference and vm.attr_id = *10* where rs.object_id =
> *1000*;

I don't believe there's actually anything very wrong here. The
large-looking estimate for the join size is not out of line: if you try
different values for object_id you will find that some produce more rows
than that and some produce less. If we had cross-column stats we could
maybe derive a better estimate, but as-is you're getting an estimate
that is probably about right on the average, depending on whether the
particular object_id matches to more common or less common reference
values.

The thing that looks funny is that the inner indexscan rowcount estimate
is so small, which is because that's being done on the assumption that
the passed-in rs.reference value is random. It's not really --- it's
more likely to be one of the more common reference values --- which is
something that's correctly accounted for in the join size estimate but
not in the inner indexscan estimate.

regards, tom lane

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Nick Baryshnikov 2011-02-15 17:22:31 BUG #5887: EXPLAIN returns more than 1000 rows when table is empty
Previous Message Kevin Grittner 2011-02-15 14:44:57 Re: BUG #5886: pgbench usage