Re: btree_gin and btree_gist for enums

From: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: btree_gin and btree_gist for enums
Date: 2017-02-23 18:16:07
Message-ID: db2b70a4-78d7-294a-a315-8e7f506c5978@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 11/05/2016 03:11 PM, Andrew Dunstan wrote:
>
>
> On 11/05/2016 01:13 PM, Tom Lane wrote:
>> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>>> On 11/05/2016 11:46 AM, Tom Lane wrote:
>>>> That may be a good fix for robustness purposes, but it seems pretty
>>>> horrid
>>>> from an efficiency standpoint. Where is this call, and should we be
>>>> modifying it to provide a flinfo?
>>> I thought of providing an flinfo, but I couldn't see a simple way to do
>>> it that would provide something much longer lived than the function
>>> call, in which case it seemed a bit pointless. That's why I asked for
>>> assistance :-)
>> Hmm. The problem is that the intermediate layer in btree_gist (and
>> probably btree_gin too, didn't look) does not pass through the
>> FunctionCallInfo data structure that's provided by the GIST AM.
>> That wasn't needed up to now, because none of the supported data types
>> are complex enough that any cached state would be useful, but trying
>> to extend it to enums exposes the shortcoming.
>>
>> We could fix this, but it would be pretty invasive since it would
>> require
>> touching just about every function in btree_gist/gin. Not entirely sure
>> that it's worth it. On the other hand, the problem might well come up
>> again in future, perhaps for a datatype where the penalty for lack of
>> a cache is greater --- eg, it would be pretty painful to support
>> record_cmp without caching. And the changes ought to be relatively
>> mechanical, even if they are extensive.
>
>
>
> Yeah. I think it's probably worth doing. btree_gin is probably a
> better place to start because it's largely macro-ized.

So looking at this we have:

static Datum
gin_btree_compare_prefix(FunctionCallInfo fcinfo)
{
Datum a = PG_GETARG_DATUM(0);
Datum b = PG_GETARG_DATUM(1);
QueryInfo *data = (QueryInfo *) PG_GETARG_POINTER(3);
int32 res,
cmp;

cmp = DatumGetInt32(DirectFunctionCall2Coll(
data->typecmp,
PG_GET_COLLATION(),
(data->strategy ==
BTLessStrategyNumber ||
data->strategy ==
BTLessEqualStrategyNumber)
? data->datum : a,
b));

and then the referred to routine in the enum case looks like this:

Datum
gin_enum_cmp(PG_FUNCTION_ARGS)
{
Oid a = PG_GETARG_OID(0);
Oid b = PG_GETARG_OID(1);
int res = 0;

if (ENUM_IS_LEFTMOST(a))
{
res = (ENUM_IS_LEFTMOST(b)) ? 0 : -1;
}
else if (ENUM_IS_LEFTMOST(b))
{
res = 1;
}
else
{
res = DatumGetInt32(DirectFunctionCall2(enum_cmp,
ObjectIdGetDatum(a),
ObjectIdGetDatum(b)));
}

PG_RETURN_INT32(res);
}

I'm not entirely sure how I should replace those DirectFunctionCall2 calls.

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2017-02-23 18:23:41 Re: Patch: Write Amplification Reduction Method (WARM)
Previous Message Dagfinn Ilmari =?utf-8?Q?Manns=C3=A5ker?= 2017-02-23 18:10:22 Re: FYI: git worktrees as replacement for "rsync the CVSROOT"