Skip site navigation (1) Skip section navigation (2)

Re: Explain explained

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Markus Stocker" <markus(at)wilabs(dot)ch>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Explain explained
Date: 2008-03-04 18:26:17
Message-ID: 17042.1204655177@sss.pgh.pa.us (view raw or flat)
Thread:
Lists: pgsql-novice
"Markus Stocker" <markus(at)wilabs(dot)ch> writes:
> 1/ How does postgresql know that the constrain individual_name.name =
> 'http://www.University0.edu' matches one row (which is in fact true)?

It doesn't *know* that, except in the special case where it sees there's
a unique index on that column.  You didn't show us the table definitions
so it's not clear if that's the case or not.  Otherwise it's just
estimating based on the pg_stats statistics for the column.  There's
a new section in the 8.3 docs that gives some more details about the
estimation process:
http://www.postgresql.org/docs/8.3/static/planner-stats-details.html

> Further, why expects postgresql
> for each operation a row size of 1?

It's not expecting more than one join partner at each step.  Again
that's just statistical.

> 2/ Sequential scans seem to me more expensive compared to index scans.
> I'm wondering why the sequential scan on individual_name is the first
> executed in the plan.

I was wondering that too; it looks like it should be a candidate for an
index search.  Datatype problem maybe?  Again, you've not shown us the
table definitions...

> 3/ There is one condition in the query, i.e. concept_assertion.concept
> = 5 with an empty result set, i.e. selectivity 0. In fact, the last
> index scan on concept_assertion ca_1 in the plan is never executed
> (this is what 'explain analyze' tells me). I'm wondering, why this
> constrain is not executed first. By executing this first, we could
> just never execute everything else.

Postgres never uses a rowcount estimate of less than one row,
so it's not going to try to optimize that way.  There are various
rationales behind that choice, but the main one is that we don't
trust the statistics unreservedly.  The odds that an estimate of zero
is more accurate than an estimate of one just aren't good enough,
and the likelihood of generating a really awful plan if we did believe
zero rows is too high.  (In fact, I've been considering whether it
wouldn't be a good idea to enforce a minimum estimate of two rows
whenever we don't see a unique index proving there can be only one.
This would help discourage the planner from using nestloops in cases
where a nestloop loses badly for more than one matching row.)

			regards, tom lane

In response to

Responses

pgsql-novice by date

Next:From: Markus StockerDate: 2008-03-04 20:52:30
Subject: Re: Explain explained
Previous:From: Rodrigo E. De León PlicetDate: 2008-03-04 15:21:07
Subject: Re: Explain explained

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group