Re: Are statistics gathered on function indexes?

From: Ray Ontko <rayo(at)ontko(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: Are statistics gathered on function indexes?
Date: 2002-06-27 18:54:21
Message-ID: 200206271854.NAA28020@shire.ontko.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Tom, et al,

Yes, thanks.

Another thing that we noticed is that when LIKE is used on an
index with a constant value like 'WILLIAMS%', a full table
scan occurs when the constant is 'W%', but the index is used
if the like string is 'WI%' or longer.

It seems to me that the selectivity of the string would vary
with the length of the string, perhaps as a fraction of the
length of the field. In other words, I would have expected
the selectivity to vary something like this:

W% -> 0.1
WI% -> 0.01
WIL% -> 0.001
WILL% -> 0.0001
WILLI% -> 0.00001
WILLIA% -> 0.000001
WILLIAM% -> 0.0000001
WILLIAMS% -> 0.00000001

In other words, if I only give one letter, then I might expect
to get about 1/10 of the table, and a full scan might make sense.
But the cost should continue to decline as I give longer and longer
strings, up to the length of the field.

Would this be a reasonable improvement to the optimizer?

Ray

[Charset iso-8859-1 unsupported, filtering to ASCII...]
> Tom-
>
> Thanks for the info-
>
> Based on your response plus some local issues, we're going to work around
> this by simply creating another column containing the results of the
> function & then index the column. That gets the results we want without
> tweaking something we may regret later.
>
> Stats for function indexes would be nice, so add our vote for it to wherever
> such things are tallied to come up with priorities.
>
> Regards,
>
> -Nick
>
> > -----Original Message-----
> > From: pgsql-admin-owner(at)postgresql(dot)org
> > [mailto:pgsql-admin-owner(at)postgresql(dot)org]On Behalf Of Tom Lane
> > Sent: Wednesday, June 26, 2002 10:37 PM
> > To: nickf(at)ontko(dot)com
> > Cc: pgsql-admin
> > Subject: Re: [ADMIN] Are statistics gathered on function indexes?
> >
> >
> > "Nick Fankhauser" <nickf(at)ontko(dot)com> writes:
> > > [see subject]
> >
> > Nope, they ain't. I agree they should be.
> >
> > > Can someone tell me how the cost is estimated for retrieving a
> > column based
> > > on a function that is indexed?
> >
> > It falls back to a default selectivity estimate, which is something
> > like 1% or 0.5% (depending on which version you are running).
> >
> > > Also, even with 2168 rows to gather, my experience based on cases where
> > > several thousand rows really are returned indicates that the index would
> > > still be a good choice. Is there a way to make the planner
> > favor index scans
> > > a bit more? (Other than the drastic set enable_seqscan to off.)
> >
> > I'd suggest reducing random_page_cost; we've seen a number of anecdotal
> > reports that the default of 4.0 is too high, though nothing systematic
> > enough to refute the experiments I did to get that number awhile back.
> > (IMHO anyway. Others may differ.)
> >
> > regards, tom lane
> >
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 3: if posting/reading through Usenet, please send an appropriate
> > subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> > message can get through to the mailing list cleanly
> >
> >
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>
>

----------------------------------------------------------------------
Ray Ontko rayo(at)ontko(dot)com Phone 1.765.935.4283 Fax 1.765.962.9788
Ray Ontko & Co. Software Consulting Services http://www.ontko.com/

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Tim Ellis 2002-06-27 23:13:32 Constraints/Triggers information?
Previous Message Nick Fankhauser 2002-06-27 16:38:07 Re: Are statistics gathered on function indexes?