Re: GiST support for UUIDs

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Adam Brusselback <adambrusselback(at)gmail(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: GiST support for UUIDs
Date: 2016-11-28 22:04:59
Message-ID: 30332.1480370699@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Adam Brusselback <adambrusselback(at)gmail(dot)com> writes:
> [ btree_gist_uuid_7.patch ]

I spent awhile looking at this. I have exactly no faith that it won't
crash on alignment-picky hardware, because this declaration:

union pg_uuid_t
{
unsigned char data[UUID_LEN];
uint64 v64[2];
};

is not the same as the existing pg_uuid_t; it instructs the compiler
that union pg_uuid_t must be double-aligned. The existing type is
only byte-aligned, and is declared that way in pg_type, which means
that values on-disk are quite likely not to meet the alignment
expectation.

I spent a little bit of time trying to get the patch to crash on a PPC
machine, without success, but the compiler I have on that (gcc 4.0.1) is
very old and is not aggressive about things like optimizing memcpy calls
on supposedly-aligned arguments. I think a modern gcc version on such
hardware would probably generate code that fails. (Also, PPC is
big-endian, so some of the at-risk code isn't used; a picky little-endian
machine would have more scope for crashing. Not real sure, but I think
ARM might be like that.)

What I would suggest is that you forget the union hack and just use
memcmp in all the comparison functions. It's not likely to be worth
the trouble to try to get those calls to be safely aligned. The
only place where you need go to any trouble is in uuid_parts_distance,
which could be redesigned to memcpy a not-necessarily-aligned source
into a local uint64[2] array and then byte-swap if needed.

(BTW, considering that we're going to return a float distance that has
only got twenty-something significant bits, do we *really* need to deal
with do-it-yourself int128 arithmetic in uuid_parts_distance? Color
me skeptical.)

Also, I can't say that I think it's good style to be duplicating the
declaration of pg_uuid_t out of uuid.c. (And it absolutely, positively,
is vile style to have two different declarations of the same struct in
different files, quite aside from whether they're ABI-compatible or not.)

I don't entirely see the point of making pg_uuid_t an opaque struct when
the only interesting thing about it, namely UUID_LEN, is exposed anyway.
So my inclination would be to just do

typedef struct pg_uuid_t
{
unsigned char data[UUID_LEN];
} pg_uuid_t;

in uuid.h and forget the idea of it being opaque.

Also, the patch could be made a good deal smaller (and easier to review)
in the wake of commit 40b449ae8: you no longer need to convert
btree_gist--1.2.sql into btree_gist--1.3.sql, just leave it alone and add
btree_gist--1.2--1.3.sql. That way we don't have to worry about whether
the upgrade script matches the change in the base script.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Artur Zakirov 2016-11-28 22:12:29 Re: [BUG?] pg_event_trigger_ddl_commands() error with ALTER TEXT SEARCH CONFIGURATION
Previous Message Andrew Dunstan 2016-11-28 21:52:53 Re: PSQL commands: \quit_if, \quit_unless