Skip site navigation (1) Skip section navigation (2)

Re: Function Stats WAS: Passing arguments to views

From: Mark Dilger <pgsql(at)markdilger(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Subject: Re: Function Stats WAS: Passing arguments to views
Date: 2006-02-03 19:19:29
Message-ID: 43E3ACC1.7010502@markdilger.com (view raw or flat)
Thread:
Lists: pgsql-hackers
Josh Berkus wrote:
> Mark,
> 
> 
>>This would only seem to work for trivial functions.  Most functions that
>>I write are themselves dependent on underlying tables, and without any
>>idea how many rows are in the tables, and without any idea of the
>>statistical distribution of those rows, I can't really say anything like
>>"average rows returned = 5".
>>
>>What I have wanted for some time is a function pairing system.  For each
>>set returning function F() I create, I would have the option of creating
>>a statistics function S() which returns a single integer which
>>represents the guess of how many rows will be returned.  S() would be
>>called by the planner, and the return value of S() would be used to
>>decide the plan.  S() would need access to the table statistics
>>information.  I imagine that the system would want to prevent S() from
>>running queries, and only allow it to call certain defined table
>>statistics functions and some internal math functions, thereby avoiding
>>any infinite recursion in the planner.  (If S() ran any queries, those
>>queries would go yet again to the planner, and on down the infinite
>>recursion you might go.)
>>
>>Of course, some (possibly most) people could chose not to write an S()
>>for their F(), and the default of 1000 rows would continue to be used. 
>>As such, this new extension to the system would be backwards compatible
>>to functions which don't have an S() defined.
> 
> 
> I think this is a fine idea, and I think I endorsed it the first time.  
> However, even a static "function returns #" would be better than what we 
> have now, and I think the S() method could take quite a bit of engineering 
> to work out (for example, what if F() is being called in a JOIN or 
> correlated subquery?).  So I'm worried that shooting for the S() idea only 
> could result in us not doing *anything* for several more versions.
> 
> What I'd like to do is implement the constant method for 8.2, and work on 
> doing the S() method later on.  Does that make sense?
> 

I have no objections to implementing the constant method sooner than the full 
version.  It might be useful to implement it as a subsyntax to the full version 
though in preparation for later expansion.  For instance, if there is a plstats 
language developed, you could limit the parser for it to just functions like:

CREATE FUNCTION S() RETURNS INTEGER PROVIDES FOR F() AS $$
	RETURN 5;
$$ LANGUAGE PLSTATS;

Then the language could be expanded later to allow calls to the table statistics 
functions.

In response to

pgsql-hackers by date

Next:From: Daniel SchuchardtDate: 2006-02-03 19:25:48
Subject: Re: Error working with Temporary Sequences in plpgsql in 8.1 (8.0
Previous:From: Josh BerkusDate: 2006-02-03 19:17:21
Subject: Re: Multiple logical databases

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group