Re: user-defined numeric data types triggering ERROR: unsupported type

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: user-defined numeric data types triggering ERROR: unsupported type
Date: 2017-09-21 14:24:02
Message-ID: 9389.1506003842@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
> [ scalarineqsel may fall over when used by extension operators ]

I concur with your thought that we could have ineq_histogram_selectivity
fall back to a "mid bucket" default if it's working with a datatype it
is unable to convert_to_scalar. But I think if we're going to touch this
at all, we ought to have higher ambition than that, and try to provide a
mechanism whereby an extension that's willing to work a bit harder could
get that additional increment of estimation accuracy. I don't care for
this way to do that:

> * Make convert_numeric_to_scalar smarter, so that it checks if there is
> an implicit cast to numeric, and fail only if it does not find one.

because it's expensive, and it only works for numeric-category cases,
and it will fail outright for numbers outside the range of "double".
(Notice that convert_numeric_to_scalar is *not* calling the regular
cast function for numeric-to-double.) Moreover, any operator ought to
know what types it can accept; we shouldn't have to do more catalog
lookups to figure out what to do.

Now that scalarltsel and friends are just trivial wrappers for a common
function, we could imagine exposing scalarineqsel_wrapper as a non-static
function, with more arguments (and perhaps a better-chosen name ;-)).
The idea would be for extensions that want to go this extra mile to
provide their own selectivity estimation functions, which again would
just be trivial wrappers for the core function, but would provide
additional knowledge through additional arguments.

The additional arguments I'm envisioning are a couple of C function
pointers, one function that knows how to convert the operator's
left-hand input type to scalar, and one function that knows how
to convert the right-hand type to scalar. (Identical APIs of course.)
Passing a NULL would imply that the core code must fall back on its
own devices for that input.

Now the thing about convert_to_scalar is that there are several different
conversion conventions already (numeric, string, timestamp, ...), and
there probably could be more once extension types are coming to the party.
So I'm imagining that the API for these conversion functions could be
something like

bool convert(Datum value, Oid valuetypeid,
int *conversion_convention, double *converted_value);

The conversion_convention output would make use of some agreed-on
constants, say 1 for numeric, 2 for string, etc etc. In the core
code, if either converter fails (returns false) or if they don't
return the same conversion_convention code, we give up and use the
mid-bucket default. If they both produce results with the same
conversion_convention, then we can treat the converted_values as
commensurable.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2017-09-21 14:42:09 Re: SCRAM in the PG 10 release notes
Previous Message amul sul 2017-09-21 14:22:32 Re: UPDATE of partition key