Re: user-defined numeric data types triggering ERROR: unsupported type

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: user-defined numeric data types triggering ERROR: unsupported type
Date: 2017-09-23 14:38:32
Message-ID: 4b444362-4240-49e0-3523-724e5932b060@2ndquadrant.com
Lists: pgsql-hackers

On 09/21/2017 04:24 PM, Tom Lane wrote:
> Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>> [ scalarineqsel may fall over when used by extension operators ]
>
> I concur with your thought that we could have
> ineq_histogram_selectivity fall back to a "mid bucket" default if
> it's working with a datatype it is unable to convert_to_scalar. But I
> think if we're going to touch this at all, we ought to have higher
> ambition than that, and try to provide a mechanism whereby an
> extension that's willing to work a bit harder could get that
> additional increment of estimation accuracy. I don't care for this
> way to do that:
>

The question is - do we need a solution that is back-patchable? If so,
we can't really fix FIXEDDECIMAL without effectively copying a lot of
the selfuncs.c stuff, only to make some minor fixes.

What about using a two-pronged approach:

1) fall back to mid bucket in back branches (9.3 - 10)
2) do something smarter (along the lines you outlined) in PG11

I'm willing to spend some time on both, but (2) alone is not
particularly attractive for us, as it only helps PG11 and we'd still
have to do the copy-paste evil to get the data type working on back
branches.

>> * Make convert_numeric_to_scalar smarter, so that it checks if
>> there is an implicit cast to numeric, and fail only if it does not
>> find one.
>
> because it's expensive, and it only works for numeric-category cases,
> and it will fail outright for numbers outside the range of "double".
> (Notice that convert_numeric_to_scalar is *not* calling the regular
> cast function for numeric-to-double.) Moreover, any operator ought to
> know what types it can accept; we shouldn't have to do more catalog
> lookups to figure out what to do.
>

Ah, I see. I hadn't realized it's not using the regular cast functions,
but now that you point that out, it's clear that relying on casts would
fail.

> Now that scalarltsel and friends are just trivial wrappers for a
> common function, we could imagine exposing scalarineqsel_wrapper as a
> non-static function, with more arguments (and perhaps a better-chosen
> name ;-)). The idea would be for extensions that want to go this
> extra mile to provide their own selectivity estimation functions,
> which again would just be trivial wrappers for the core function, but
> would provide additional knowledge through additional arguments.
>
> The additional arguments I'm envisioning are a couple of C function
> pointers, one function that knows how to convert the operator's
> left-hand input type to scalar, and one function that knows how to
> convert the right-hand type to scalar. (Identical APIs of course.)
> Passing a NULL would imply that the core code must fall back on its
> own devices for that input.
>
> Now the thing about convert_to_scalar is that there are several
> different conversion conventions already (numeric, string, timestamp,
> ...), and there probably could be more once extension types are
> coming to the party. So I'm imagining that the API for these
> conversion functions could be something like
>
> bool convert(Datum value, Oid valuetypeid,
> int *conversion_convention, double *converted_value);
>
> The conversion_convention output would make use of some agreed-on
> constants, say 1 for numeric, 2 for string, etc etc. In the core
> code, if either converter fails (returns false) or if they don't
> return the same conversion_convention code, we give up and use the
> mid-bucket default. If they both produce results with the same
> conversion_convention, then we can treat the converted_values as
> commensurable.
>

OK, so instead of re-implementing the whole function, we'd essentially
do just something like this:

typedef bool (*convert_callback) (Datum value, Oid valuetypeid,
                                  int *conversion_convention,
                                  double *converted_value);

double
scalarineqsel(PlannerInfo *root, Oid operator,
              bool isgt, bool iseq, VariableStatData *vardata,
              Datum constval, Oid consttype,
              convert_callback convert);

and then, from the extension

double
scalarineqsel_wrapper(PlannerInfo *root, Oid operator,
                      bool isgt, bool iseq, VariableStatData *vardata,
                      Datum constval, Oid consttype)
{
    return scalarineqsel(root, operator, isgt, iseq, vardata,
                         constval, consttype, my_convert_func);
}

Sounds reasonable to me, I guess - I can't really think of anything
simpler that would give us the same flexibility.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
