Re: Function Stats WAS: Passing arguments to views

From: Mark Dilger <pgsql(at)markdilger(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Function Stats WAS: Passing arguments to views
Date: 2006-02-03 20:09:01
Message-ID: 43E3B85D.1060007@markdilger.com
Lists: pgsql-hackers

Josh Berkus wrote:
> Tom,
>
>
>>>What I'd like to do is implement the constant method for 8.2, and work
>>>on doing the S() method later on. Does that make sense?
>>
>>I'm not thrilled with putting in a stopgap that we will have to support
>>forever. The constant method is *clearly* inadequate for many (probably
>>most IMHO) practical cases. Where do you see it being of use?
>
>
> Well, mostly for the real-world use cases where I've run into SRF estimate
> issues, which have mostly been SRFs which return one row.
>
>
>>W.R.T. the estimator function method, the concern about recursion seems
>>misplaced. Such an estimator presumably wouldn't invoke the associated
>>function itself.
>
>
> No, but if you're calling the S() estimator in the context of performing a
> join, what do you supply for parameters?

I've been thinking about this more, and now I don't see why this is an issue.
When the planner estimates how many rows will be returned from a subquery that
is being used within a join, it can't know which "parameters" to use either.
(By "parameters" I mean the values the subquery depends on that come from some
other part of the full query's execution.) So it seems to me that function S()
is at no more of a disadvantage than the planner.

If I defined a function S(a integer, b integer) which provides an estimate for
the function F(a integer, b integer), then S(null, null) could be called when
the planner can't know what a and b are. S could then still make use of the
table statistics to provide some sort of estimate. Of course, this would mean
that S() functions cannot be declared STRICT, since a strict function is never
actually called with a null argument; it just returns null.
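
To make the S(null, null) idea concrete, here is a minimal sketch. It is
written in Python purely for illustration, since no language for S() has been
settled. Every name in it (ColumnStats, selectivity, estimate_rows_f) is
hypothetical and not an existing PostgreSQL API, and it assumes F(a, b) returns
the rows matching both arguments, with the two columns independent. None stands
in for an argument the planner cannot know.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class ColumnStats:
    """Hypothetical per-column statistics (not a real PostgreSQL structure)."""
    n_distinct: float                       # estimated number of distinct values
    common_freqs: Dict[int, float] = field(default_factory=dict)  # value -> row fraction


def selectivity(value: Optional[int], col: ColumnStats) -> float:
    """Fraction of rows expected to match `column = value`."""
    if value is None:
        # Argument unknown at plan time: fall back to the average selectivity.
        return 1.0 / col.n_distinct
    # Known argument: use its recorded frequency if it is a common value,
    # otherwise the same average (a deliberate simplification).
    return col.common_freqs.get(value, 1.0 / col.n_distinct)


def estimate_rows_f(a: Optional[int], b: Optional[int], row_count: float,
                    col_a: ColumnStats, col_b: ColumnStats) -> float:
    """Hypothetical S(a, b): estimated rows returned by F(a, b).

    Assumes F returns rows where both columns match and that the columns
    are statistically independent. Never estimates fewer than one row.
    """
    return max(1.0, row_count * selectivity(a, col_a) * selectivity(b, col_b))
```

Calling estimate_rows_f(None, None, ...) plays the role of S(null, null): the
estimator still produces a number from the table statistics alone, which is
exactly why it must not be strict.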

>>I'm more concerned about coming up with a usable API
>>for such things. Our existing mechanisms for estimating operator
>>selectivities require access to internal planner data structures, which
>>makes it pretty much impossible to write them in anything but C. We'd
>>need something cleaner to have a feature I'd want to export for general
>>use.
>
>
> Yes -- we need to support the simplest case, which is functions that return
> either (a) a fixed number of rows, or (b) a fixed multiple of the number
> of rows passed to the function. These simple cases should be easy to
> build. For more complex estimation, I personally don't see a problem with
> forcing people to hack it in C.

Could we provide table statistics access functions in whatever higher-level
language S() is written in, or is there something fundamentally squirrelly about
the statistics that would make this impossible?
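
As a rough illustration of what such access functions might look like from the
estimator's side, here is a sketch in Python. The StatsAccess interface, its
two methods, and the table and column names are all assumptions made up for
this example; nothing like it exists in the backend today. An in-memory
stand-in replaces the real statistics so the sketch runs without a database.

```python
from typing import Dict, Protocol, Tuple


class StatsAccess(Protocol):
    """Hypothetical read-only view of planner statistics exposed to a PL."""

    def row_count(self, table: str) -> float: ...

    def n_distinct(self, table: str, column: str) -> float: ...


class InMemoryStats:
    """Stand-in backed by plain dicts, used here instead of a real catalog."""

    def __init__(self, rows: Dict[str, float],
                 distinct: Dict[Tuple[str, str], float]) -> None:
        self._rows = rows
        self._distinct = distinct

    def row_count(self, table: str) -> float:
        return self._rows[table]

    def n_distinct(self, table: str, column: str) -> float:
        return self._distinct[(table, column)]


def estimate_orders_per_customer(stats: StatsAccess) -> float:
    """Hypothetical estimator: average rows per customer, using only the accessors."""
    return stats.row_count("orders") / stats.n_distinct("orders", "customer_id")


# Example with made-up numbers: 1000 rows spread over 50 distinct customers.
stats = InMemoryStats({"orders": 1000.0}, {("orders", "customer_id"): 50.0})
print(estimate_orders_per_customer(stats))
```

The point of keeping the interface this narrow is that the estimator never
touches internal planner data structures; it only sees a couple of scalar
lookups.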

Also, since we haven't nailed down a language for S(), if we allowed any of SQL,
PL/pgSQL, PL/Perl, PL/Python, etc., then we would need statistics access
functions for each, which would place a burden on all PLs, right? That argument
isn't strong enough to make me lean either way; it's just an observation.
