Re: pgsql: Allow opclasses to provide tri-valued GIN consistent functions.

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-committers(at)postgresql(dot)org
Subject: Re: pgsql: Allow opclasses to provide tri-valued GIN consistent functions.
Date: 2014-03-23 13:22:45
Message-ID: 532EE025.30109@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-committers

On 03/22/2014 03:33 PM, Andrew Dunstan wrote:
>
> On 03/21/2014 06:43 PM, Heikki Linnakangas wrote:
>> On 03/21/2014 10:47 PM, Andres Freund wrote:
>>> On 2014-03-21 17:37:35 -0400, Tom Lane wrote:
>>>> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
>>>>> I think the GinLogicValueEnum is supposed to be an enum's name, not a
>>>>> variable name, right?
>>>>
>>>> I think the whole thing is too cute by half. Why isn't it just
>>>>
>>>> typedef enum GinLogicValue
>>>> {
>>>> GIN_FALSE = 0, /* item is present / matches */
>>>> GIN_TRUE = 1, /* item is not present / does not
>>>> match */
>>>> GIN_MAYBE = 2 /* don't know if item is present /
>>>> don't know if
>>>> * matches */
>>>> } GinLogicValue;
>>>>
>>>> instead of thinking that we are smarter than the compiler about how
>>>> to store the enum?
>>>
>>> It seems to be a memory only type, so using anything but the raw enum
>>> type seems odd. If it were ondisk alignment stuff could make it
>>> advantageous, but this way...
>>
>> That enum is used in the "check" arrays that are passed around GIN,
>> with one element per index term being searched. I'd really like to
>> keep it small, because the can be hundreds of elements long, and if
>> the compiler decides to store it as a 4- or 8-byte value instead of
>> one byte, it starts to add up.
>>
>> Besides memory usage, it's convenient that an array of GinLogicValues
>> is compatible with an array of booleans, as long as there are no
>> GIN_MAYBE values in it.
>>
>> I committed your original fix to make it an enum type, like it was
>> supposed to be. Thanks!
>
> I don't think there's any guarantee it's only going to be one byte. The
> ANSI C standard says:
>
> Each enumerated type shall be compatible with char, a signed integer
> type, or an unsigned integer type. The choice of type is
> implementation-defined. (6.7.2.2 Enumerationspecifiers)

Here's the code from gin.h (after fixing the issues Andres and Tom
pointed out):

> enum GinLogicValueEnum
> {
> GIN_FALSE = 0, /* item is not present / does not match */
> GIN_TRUE = 1, /* item is present / matches */
> GIN_MAYBE = 2 /* don't know if item is present / don't know if
> * matches */
> };
>
> typedef char GinLogicValue;

The reason for the typedef is precisely that an enum is not guaranteed
to be one byte. Tom suggested getting rid of the typedef, but it's
needed to make sure it's stored as one byte.

I'll go add a comment to it, explaining why it's needed.

- Heikki

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Andrew Dunstan 2014-03-23 14:10:44 Re: pgsql: Allow opclasses to provide tri-valued GIN consistent functions.
Previous Message Noah Misch 2014-03-23 07:49:52 pgsql: Don't test xmin/xmax columns of a postgres_fdw foreign table.