Re: join selectivity

From: "Mark Cave-Ayland" <m(dot)cave-ayland(at)webbased(dot)co(dot)uk>
To: "'Tom Lane'" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <strk(at)refractions(dot)net>, <pgsql-hackers(at)postgresql(dot)org>, <postgis-devel(at)postgis(dot)refractions(dot)net>
Subject: Re: join selectivity
Date: 2004-12-16 17:45:08
Message-ID: 9EB50F1A91413F4FA63019487FCD251DAD22@WEBBASEDDC.webbasedltd.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


> -----Original Message-----
> From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
> Sent: 16 December 2004 15:55
> To: Mark Cave-Ayland
> Cc: strk(at)refractions(dot)net; pgsql-hackers(at)postgresql(dot)org;
> postgis-devel(at)postgis(dot)refractions(dot)net
> Subject: Re: [HACKERS] join selectivity
>
>
> "Mark Cave-Ayland" <m(dot)cave-ayland(at)webbased(dot)co(dot)uk> writes:
> > ...and with two indices RESTRICT is called four times. The part I
> > find
> > confusing is why with one index that RESTRICT is called twice.
>
> [ shrug... ] clause_selectivity doesn't try to cache the result.

Hi Tom,

OK I think I've misunderstood something more fundamental than that; I
understood from what you said that the RESTRICT clause is used to evaluate
the cost of table1.geom && table2.geom against table2.geom && table1.geom
(i.e. it is used to help decide which one should be seq scanned and which
should be index scanned in a nested loop node). So is the trick here for a
commutative operator to simply return the same value for both cases, as
other factors such as index size costs are considered elsewhere?

My final question would be how would can we detect the difference between
RESTRICT being called in this manner (as part of <column> <op> <column> with
an unknown constant) as opposed to <column> <op> <constant> with a known
constant?

Many thanks,

Mark.

------------------------
WebBased Ltd
South West Technology Centre
Tamar Science Park
Plymouth
PL6 8BT

T: +44 (0)1752 791021
F: +44 (0)1752 791023
W: http://www.webbased.co.uk

> > I was also thinking whether calling RESTRICT when comparing with an
> > unknown value is worth doing at all, however I did think
> that perhaps
> > if you are using a cast to perform an operation on two
> datatypes, then
> > you may be able to imply something from the index, such as its
> > physical size, and hint that the planner should use a
> particular index
> > in preference for the other.
>
> That would be inappropriate; the index size is factored in elsewhere
> (gistcostestimate() to be specific). Restriction selectivity
> shouldn't directly consider the existence of indexes at all.
>
> > Would it be correct to assume that if returning the same value for
> > RESTRICT for both means that the planner will choose one at random?
>
> If the tables/indexes are exactly the same size then you'd
> get the same cost and the choice would be effectively random.
>
> regards, tom lane
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Richard Huxton 2004-12-16 17:54:02 Re: [Testperf-general] BufferSync and bgwriter
Previous Message Andrew Dunstan 2004-12-16 17:44:25 integer datetimes