Re: Progress on fast path sorting, btree index creation time

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PG Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Progress on fast path sorting, btree index creation time
Date: 2012-01-31 19:47:19
Message-ID: CA+TgmoY+2ZTt82nzp+GX6OevQkEpWv5KFhn8yzxGDWJnUwp9kQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jan 27, 2012 at 3:33 PM, Peter Geoghegan <peter(at)2ndquadrant(dot)com> wrote:
> Patch is attached. I have not changed the duplicate functions. This is
> because I concluded that it was the lesser of two evils to have to get
> the template to generate both declarations in the header file, and
> definitions in the .c file - that seemed particularly obscure. We're
> never going to have to expose/duplicate any more comparators anyway.
> Do you agree?

Not really. You don't really need macros to generate the prototypes;
you could just write them out longhand.

I think there's a mess of naming confusion in here, though, as perhaps
best illlustrated by this macro definition:

#define TEMPLATE_QSORT_ARG_HEAP(TYPE, COMPAR) \
DO_TEMPLATE_QSORT_ARG(TYPE, COMPAR, inlheap, \
SING_ADDITIONAL_CODE, TYPE##inlheapcomparetup_inline) \
DO_TEMPLATE_QSORT_ARG(TYPE, COMPAR, regheap, \
MULT_ADDITIONAL_CODE(TYPE##regheapAppFunc), \
TYPE##regheapcomparetup_inline)

The idea here is that when we have only a single sort key, we include
SING_ADDITIONAL_CODE in the tuple comparison function, whereas when we
have more than one, we instead include MULT_ADDITIONAL_CODE. Right
there, I think we have a naming problem, because abbreviating "single"
to "sing" and multiple to "mult" is less than entirely clear. For a
minute or two I was trying to figure out whether our sorting code was
musically inclined, and I'm a native english speaker. But then we
switch to another set of terminology completely for the generated
functions: inlheap for the single-key case, and regheap for the
multiple-key case. I find that even more confusing.

I think we ought to get rid of this:

+typedef enum TypeCompar
+{
+ TYPE_COMP_OTHER,
+ TYPE_COMP_INT4,
+ TYPE_COMP_INT8,
+ TYPE_COMP_FLOAT4,
+ TYPE_COMP_FLOAT8,
+ TYPE_COMP_FULL_SPECIALISATION
+} TypeCompar;

Instead, just modify SortSupportData to have two function pointers as
members, one for the single-key case and another for the multiple-key
case, and have the sortsupport functions initialize them to the actual
functions that should be called. The layer of indirection, AFAICS,
serves no purpose.

> It's pretty easy to remove a specialisation at any time - just remove
> less than 10 lines of code. It's also pretty difficult to determine,
> with everyone's absolute confidence, where the right balance lies.
> Perhaps the sensible thing to do is to not be so conservative in what
> we initially commit, while clearly acknowledging that we may not have
> the balance right, and that it may have to change. We then have the
> entire beta part of the cycle in which to decide to roll back from
> that position, without any plausible downside. If, on the other hand,
> we conservatively lean towards fewer specialisations in the initial
> commit, no one will complain about the lack of an improvement in
> performance that they never had.

Eh, really? Typically when we do something good, the wolves are
howling at the door to make it work in more cases.

> I think that possibly the one remaining blocker to tentatively
> committing this with all specialisations intact is that I haven't
> tested this on Windows, as I don't currently have access to a Windows
> development environment. I have set one up before, but it's a huge
> pain. Can anyone help me out?

This doesn't strike me as terribly OS-dependent, unless by that we
mean compiler-dependent.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Merlin Moncure 2012-01-31 19:49:41 Re: JSON for PG 9.2
Previous Message Peter Geoghegan 2012-01-31 19:42:19 Re: Issues with C++ exception handling in an FDW