Re: Enhanced containment selectivity function

From: Matteo Beccati <php(at)beccati(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: Enhanced containment selectivity function
Date: 2005-08-04 17:09:30
Message-ID: 42F24BCA.2030802@beccati.com
Lists: pgsql-hackers pgsql-patches

Tom Lane wrote:
> After looking at this a little, it doesn't seem like it has much to do
> with the ordinary 2-D notion of containment. In most of the core
> geometric types, the "histogram" ordering is based on area, and so
> testing the histogram samples against the query doesn't seem like it's
> able to give very meaningful containment results --- the items shown
> in the histogram could have any locations whatever.
>
> The approach might be sensible for ltree's isparent operator --- I don't
> have a very good feeling for the behavior of that operator, but it looks
> like it has at least some relationship to the ordering induced by the
> ltree < operator.

Actually, this was one of my doubts. The custom function seems to work
well with ltree, but this could also depend on the way my dataset is
organized.

> So my thought is that (assuming Oleg and Teodor agree this is sensible
> for ltree) we should put the selectivity function into contrib/ltree,
> not directly into the core. It might be best to call it something like
> "parentsel", too, to avoid giving the impression that it has something
> to do with 2-D containment.
>
> Also, you should think about using the most-common-values list as well
> as the histogram. I would guess that many ltree applications would have
> enough duplicate entries that the MCV list represents a significant
> fraction of the total population. Keep in mind when thinking about this
> that the histogram describes the population of data *exclusive of the
> MCV entries*.

I also agree that "parentsel" would better fit its purpose.

My patch originally used the MCV list, without good results, until I
realized that the MCV list was empty because the column contains unique
values :) I'll look into adding an MCV check to it.
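
To make the MCV point concrete, here is a rough sketch of how the two
statistics might be combined. It is not the patch's code and does not use
PostgreSQL's API: is_ancestor, parent_selectivity and all of their parameters
are hypothetical names, and ltree values are simplified to dot-separated
strings. Matching MCV entries contribute their own frequencies, while the
fraction of matching histogram samples is scaled by the share of rows that
are neither NULL nor in the MCV list.

```python
# Hypothetical sketch only: names and data layout are assumptions,
# not the patch and not PostgreSQL internals.

def is_ancestor(parent: str, child: str) -> bool:
    """True if dot-separated path `parent` is an ancestor of, or equal to, `child`."""
    p = parent.split(".")
    c = child.split(".")
    return c[:len(p)] == p


def parent_selectivity(query, mcv_values, mcv_freqs, histogram, null_frac=0.0):
    """Estimate the fraction of rows whose path lies under `query`."""
    # Rows covered by MCV entries that satisfy the operator.
    mcv_match = sum(f for v, f in zip(mcv_values, mcv_freqs)
                    if is_ancestor(query, v))
    # Share of rows that are neither NULL nor in the MCV list;
    # the histogram describes only this part of the population.
    rest = max(0.0, 1.0 - null_frac - sum(mcv_freqs))
    # Fraction of histogram samples that satisfy the operator.
    if histogram:
        hist_match = sum(1 for v in histogram
                         if is_ancestor(query, v)) / len(histogram)
    else:
        hist_match = 0.0
    return mcv_match + rest * hist_match
```

With an empty MCV list, as in my unique-valued column, the estimate reduces
to the histogram fraction alone.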

Moving it into contrib/ltree would be more difficult for me because it
depends on other functions declared in selfuncs.c
(get_restriction_variable, etc.).

Thank you for your feedback

Best regards
--
Matteo Beccati
http://phpadsnew.com/
http://phppgads.com/
