Re: Stats for multi-column indexes

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Richard Huxton <dev(at)archonet(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Mark Kirkwood <markir(at)paradise(dot)net(dot)nz>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Stats for multi-column indexes
Date: 2007-03-20 17:46:56
Message-ID: 1174412816.23455.521.camel@dogma.v10.wvs
Lists: pgsql-hackers

On Tue, 2007-03-20 at 18:12 +0100, Josh Berkus wrote:
> Tom,
>
> > Actually, I think you don't particularly need stats for that in most
> > cases --- if the planner simply took note that the FK relationship
> > exists, it would know that each row of the FK side joins to exactly
> > one row of the PK side, which in typical cases is sufficient.
>
> Is it? What about the other direction? Currently, doesn't the planner
> assume that the rowcount relationship is 1 to ( child total rows /
> parent total rows) ? That's ok for tables with relatively even
> distribution, but not for skewed ones.
>

In theory, the PK constrains the available values of the FK, but doesn't
provide any additional information about the relationship between the
columns.

However, in practice there is limited space to store MCVs (most common
values), and the n_distinct estimate has limited accuracy. So there may
be a reason to store more information, but I don't know what we'd store.
Do we have reports of bad estimates by the planner in this situation?
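
To make the question concrete, here is a toy sketch in Python of the two estimates being discussed: the even-distribution assumption (child rows / parent rows) and an estimate from a most-common-values list with a limited number of slots. This is a simplified model, not the planner's actual code; the function names, the `mcv_slots` parameter, and the sample data are all made up for illustration.

```python
from collections import Counter


def uniform_estimate(child_rows, parent_rows):
    """Rows expected per parent key if child rows are spread evenly."""
    return child_rows / parent_rows


def mcv_estimate(fk_values, key, mcv_slots, n_distinct):
    """Estimate the row count for `key` from a size-limited MCV list.

    Keys that made it into the list use their recorded count. All other
    keys are assumed to share the remaining rows evenly (an assumption
    of this toy model).
    """
    counts = Counter(fk_values)
    mcv = dict(counts.most_common(mcv_slots))
    if key in mcv:
        return float(mcv[key])
    remaining_rows = len(fk_values) - sum(mcv.values())
    remaining_keys = n_distinct - len(mcv)
    return remaining_rows / remaining_keys if remaining_keys > 0 else 0.0


# Hypothetical skewed FK column: 4 parent keys, 100 child rows.
fk_values = [1] * 85 + [2] * 10 + [3] * 4 + [4] * 1
actual = Counter(fk_values)

flat = uniform_estimate(len(fk_values), 4)
hot = mcv_estimate(fk_values, key=1, mcv_slots=1, n_distinct=4)
cold = mcv_estimate(fk_values, key=4, mcv_slots=1, n_distinct=4)

print("uniform estimate per key:", flat)
print("key 1: estimate", hot, "actual", actual[1])
print("key 4: estimate", cold, "actual", actual[4])
```

With only one MCV slot in this toy data, the hot key is estimated well, but the rarest key is still overestimated because it shares the leftover rows evenly with the other keys that did not fit in the list.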

Regards,
Jeff Davis
