Re: Inlining comparators as a performance optimisation

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Geoghegan <peter(at)2ndquadrant(dot)com>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Inlining comparators as a performance optimisation
Date: 2011-12-02 03:22:32
Message-ID: CA+TgmoYpc1drkB_8bBhBa=KwpCfFp4=m5LjXObBMMi0UGC88QQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Dec 1, 2011 at 9:46 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> IMO, entries in pg_proc ought to refer to functions that are callable
> by the standard interface: breaking that basic contract is not going to
> lead anywhere pleasant.  Nor do I really want yet more columns in
> pg_proc.

I wasn't proposing to create pg_proc entries for this.

> Nor does the "register a pointer" scheme you suggest make
> any sense to me: you still need to connect these things to catalog
> entries, pg_opclass entries in particular, and the intermediate handle
> adds nothing to the situation except for creating a risk of collisions.

I think you might be misinterpreting what I had in mind. Right now,
pg_amproc entries have an "amproc" column that points to a pg_proc
entry that in turn points to a function that takes
FunctionCallInfoData as an argument. What I'm proposing to do is add
an additional column to that catalog that points more or less directly
to a non-SQL-callable function, but it can't actually just be the
address of the function because that's not stable. So what I'm
proposing is that we interpose the thinnest possible shim layer
between the catalog and a function pointer, and an int64 -> function
pointer mapping seemed to me like something that would fit the bill.

> The scheme that was rolling around in my mind was about like this:
>
> * Define btree opclasses to have an optional support function,
> amprocnum 2, that has a SQL signature of func(internal) returns void.
>
> * The actual argument represented by the "internal" parameter would
> be a pointer to a struct which is to be filled in by the support
> function.  We'd call the support function once, during tuplesort
> setup or btree index relcache entry setup, and save the struct
> somewhere.
>
> * The struct contents would be pointers to functions that have a
> non-FunctionCallInfo interface.  We know we need at least two:
> a full qsort replacement, and a non-FCI comparator.  We might want
> more in future, if someone convinces us that additional specializations
> of sorting are worth the trouble.  So I imagine a struct like this:
>
>        typedef struct SortSupportFunctions {
>                void    (*inline_qsort) (Datum *elements, int nelements);
>                int     (*comparator) (Datum a, Datum b);
>        } SortSupportFunctions;
>
> with the agreement that the caller must zero out the struct before call,
> and then the btree support function sets the function pointers for any
> specializations it's going to offer.  If we later add a third or fourth
> function pointer, datatypes that know about that could fill in those
> pointers, but datatypes that haven't been updated aren't broken.
>
> One thing I'm not too certain about is whether to define the APIs just
> as above, or to support a passthrough argument of some sort (and if so,
> what does it reference)?  Possibly any datatype that we'd actually care
> about this for is going to be simple enough to not need any state data.
> Or possibly not.  And what about collations?

Maybe there should be a comparator_setup function that gets the
collation OID and returns void *, and then that void * value gets
passed as a third argument to each call to the comparator function.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2011-12-02 03:48:07 Re: Inlining comparators as a performance optimisation
Previous Message Tom Lane 2011-12-02 03:21:26 Re: Why so few built-in range types?