Re: Export user visible function to make use of convert_to_scalar

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "PostgreSQL-development Hackers" <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Export user visible function to make use of convert_to_scalar
Date: 2007-07-30 15:22:13
Message-ID: 87vec19asa.fsf@oxford.xeocode.com
Lists: pgsql-patches


"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
>> Attached is a patch which implements, as discussed briefly on -hackers, a
>> user-visible function to get at the information that convert_to_scalar uses to
>> generate selectivity estimates.
>
> This is an astonishingly bad idea, as it exposes and thereby sets in
> stone one of the worst, most in-need-of-rethinking areas in selfuncs.c.

No, it sets in stone only the concept that at some point somehow we'll have to
come up with an estimate of where a value lies in a range. That's inevitable
if we hope to be able to make any kind of useful estimates at all. How the
internals go about doing it isn't set in stone at all.

What it seems ought to happen here eventually is that each scalar type should
provide its own function implementing convert_to_scalar. That would let new
user-defined types implement the conversion as appropriate.
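
A minimal sketch of that per-type idea, in Python rather than the backend's C.
It is not PostgreSQL's implementation: the TO_SCALAR registry and the names
to_scalar and position_in_range are hypothetical, as is the choice to return
0.5 for a degenerate range. It only shows how a type-supplied conversion
yields an estimate of where a value lies in a range.

```python
from datetime import date

# Hypothetical per-type converters: each maps a value to a float so that
# ordering (and, roughly, spacing) is preserved. Not PostgreSQL's code.
TO_SCALAR = {
    int: float,
    float: float,
    date: lambda d: float(d.toordinal()),
}


def to_scalar(value):
    """Look up the converter registered for the value's exact type."""
    try:
        convert = TO_SCALAR[type(value)]
    except KeyError:
        raise TypeError(
            f"no scalar conversion registered for {type(value).__name__}"
        )
    return convert(value)


def position_in_range(value, lo, hi):
    """Estimate where value lies in [lo, hi] as a fraction from 0 to 1."""
    v, low, high = to_scalar(value), to_scalar(lo), to_scalar(hi)
    if high <= low:
        return 0.5  # degenerate range: no information, assume the middle
    # Clamp so values outside the range map to the nearest end.
    return min(1.0, max(0.0, (v - low) / (high - low)))
```

A new type would take part by adding one entry to the registry, for example
TO_SCALAR[MyType] = my_type_to_float, without touching the estimation code.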

> The way to not encourage it is to not provide it.

Well, _some_ applications do need to use it: for example, applications which
provide a graphical view of the pg_statistic information. How would you suggest
drawing a chart of histogram values without access to this information?

The other method I thought of was to run EXPLAIN repeatedly with different
bounds and pick out the estimates from the output. This requires hard-wiring
into the application an understanding of every data type and how to generate a
series of values between two bounds (or implementing width_bucket for every
data type), and then executing a whole bucketload of EXPLAINs. Even then, the
resulting graphs would be a worse representation of the statistics, since they
wouldn't correspond exactly to the buckets in the histograms.
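
To make the cost of that workaround concrete, here is a rough sketch of the
client-side pieces it would need, for a numeric column only. The function
names are hypothetical, the table and column names are assumed to be trusted
identifiers, and actually running the statements against a server is left
out. The parsing relies on the "rows=N" field that plain EXPLAIN prints on
each plan line.

```python
import re

ROWS_RE = re.compile(r"rows=(\d+)")


def estimated_rows(explain_line):
    """Pull the planner's row estimate out of one line of EXPLAIN output."""
    match = ROWS_RE.search(explain_line)
    if match is None:
        raise ValueError("no rows= estimate found")
    return int(match.group(1))


def probe_bounds(lo, hi, steps):
    """Evenly spaced numeric bounds; every other type needs its own rule."""
    width = (hi - lo) / steps
    return [lo + i * width for i in range(steps + 1)]


def probe_queries(table, column, bounds):
    """One EXPLAIN per adjacent pair of bounds."""
    return [
        f"EXPLAIN SELECT * FROM {table} WHERE {column} >= {a} AND {column} < {b}"
        for a, b in zip(bounds, bounds[1:])
    ]
```

The probe bounds are chosen by the client, so they generally will not line up
with the histogram's own bucket boundaries, which is the mismatch described
above.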

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
