Re: Function Stats WAS: Passing arguments to views

From: Mark Dilger <pgsql(at)markdilger(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Function Stats WAS: Passing arguments to views
Date: 2006-02-03 19:19:29
Message-ID: 43E3ACC1.7010502@markdilger.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Josh Berkus wrote:
> Mark,
>
>
>>This would only seem to work for trivial functions. Most functions that
>>I write are themselves dependent on underlying tables, and without any
>>idea how many rows are in the tables, and without any idea of the
>>statistical distribution of those rows, I can't really say anything like
>>"average rows returned = 5".
>>
>>What I have wanted for some time is a function pairing system. For each
>>set returning function F() I create, I would have the option of creating
>>a statistics function S() which returns a single integer which
>>represents the guess of how many rows will be returned. S() would be
>>called by the planner, and the return value of S() would be used to
>>decide the plan. S() would need access to the table statistics
>>information. I imagine that the system would want to prevent S() from
>>running queries, and only allow it to call certain defined table
>>statistics functions and some internal math functions, thereby avoiding
>>any infinite recursion in the planner. (If S() ran any queries, those
>>queries would go yet again to the planner, and on down the infinite
>>recursion you might go.)
>>
>>Of course, some (possibly most) people could chose not to write an S()
>>for their F(), and the default of 1000 rows would continue to be used.
>>As such, this new extension to the system would be backwards compatible
>>to functions which don't have an S() defined.
>
>
> I think this is a fine idea, and I think I endorsed it the first time.
> However, even a static "function returns #" would be better than what we
> have now, and I think the S() method could take quite a bit of engineering
> to work out (for example, what if F() is being called in a JOIN or
> correlated subquery?). So I'm worried that shooting for the S() idea only
> could result in us not doing *anything* for several more versions.
>
> What I'd like to do is implement the constant method for 8.2, and work on
> doing the S() method later on. Does that make sense?
>

I have no objections to implementing the constant method sooner than the full
version. It might be useful to implement it as a subsyntax to the full version
though in preparation for later expansion. For instance, if there is a plstats
language developed, you could limit the parser for it to just functions like:

CREATE FUNCTION S() RETURNS INTEGER PROVIDES FOR F() AS $$
RETURN 5;
$$ LANGUAGE PLSTATS;

Then the language could be expanded later to allow calls to the table statistics
functions.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Schuchardt 2006-02-03 19:25:48 Re: Error working with Temporary Sequences in plpgsql in 8.1 (8.0
Previous Message Josh Berkus 2006-02-03 19:17:21 Re: Multiple logical databases