Skip site navigation (1) Skip section navigation (2)

Re: WIP: cross column correlation ...

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL - Hans-Jürgen Schönig <postgres(at)cybertec(dot)at>, pgsql-hackers Hackers <pgsql-hackers(at)postgresql(dot)org>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>
Subject: Re: WIP: cross column correlation ...
Date: 2011-02-24 15:07:35
Message-ID: 201102241507.p1OF7ZU25175@momjian.us (view raw or flat)
Thread:
Lists: pgsql-hackers
Robert Haas wrote:
> On Wed, Feb 23, 2011 at 10:30 PM, Bruce Momjian <bruce(at)momjian(dot)us> wrote:
> > Robert Haas wrote:
> >> If you want to take the above as in any way an exhaustive survey of
> >> the landscape (which it isn't), C seems like a standout, maybe
> >> augmented by the making the planner able to notice that A1 = x1 AND A2
> >> = x2 is equivalent to (A1,A2) = (x1, x2) so you don't have to rewrite
> >> queries as much.
> >>
> >> I don't really know how to handle the join selectivity problem. ?I am
> >> not convinced that there is a better solution to that than decorating
> >> the query. ?After all the join selectivity depends not only on the
> >> join clause itself, but also on what you've filtered out of each table
> >> in the meantime.
> >
> > Thinking some more, I think another downside to the "decorate the query"
> > idea is that many queries use constants that are supplied only at
> > runtime, so there would be no way to hard-code a selectivity value into
> > a query when you don't know the value. ?Could a selectivity function
> > handle that?
> 
> Beats me.  What do you have in mind?

My point is just that many queries have constants who's values are not
known at the time the query is written, so any system should have a way
to handle that somehow.  This is why query decoration is usually not a
good solution, and why something more flexible that is stored as part of
the column is preferred.

Perhaps a selectivity function that has easy access to the computed
selectivity of the constant involved might be a win.  For example, for
the zip code/state code case we could have something like:

	function mysel(zip, state) { pgsel(zip)}

meaning we would still use the selectivities found in the optimizer
statistics (pgsel), but modify them in some way.  In the case above, the
selectivity only comes from the zip code.  You could also do things like:

	function mysel(x, y) { pgsel(x) * pgsel(y) * 0.001}

Such functions have a higher probability of working for all queries
involving that column.

-- 
  Bruce Momjian  <bruce(at)momjian(dot)us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

In response to

pgsql-hackers by date

Next:From: Tom LaneDate: 2011-02-24 15:21:41
Subject: Re: Review: Fix snapshot taking inconsistencies
Previous:From: Alvaro HerreraDate: 2011-02-24 15:06:17
Subject: Re: Review: Fix snapshot taking inconsistencies

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group