Re: BRIN indexes (was Re: Minmax indexes)

From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Nicolas Barbier <nicolas(dot)barbier(at)gmail(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BRIN indexes (was Re: Minmax indexes)
Date: 2014-09-09 11:49:17
Message-ID: 540EE93D.7090301@vmware.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 09/08/2014 07:02 PM, Alvaro Herrera wrote:
> Here's version 18. I have renamed it: These are now BRIN indexes.
>
> I have fixed numerous race conditions and deadlocks. In particular I
> fixed this problem you noted:
>
> Heikki Linnakangas wrote:
>> Another race condition:
>>
>> If a new tuple is inserted to the range while summarization runs,
>> it's possible that the new tuple isn't included in the tuple that
>> the summarization calculated, nor does the insertion itself udpate
>> it.
>
> I did it mostly in the way you outlined, i.e. by way of a placeholder
> tuple that gets updated by concurrent inserters and then the tuple
> resulting from the scan is unioned with the values in the updated
> placeholder tuple. This required the introduction of one extra support
> proc for opclasses (pretty simple stuff anyhow).

Hmm. So the union support proc is only called if there is a race
condition? That makes it very difficult to test, I'm afraid.

It would make more sense to pass BrinValues to the support functions,
rather than DeformedBrTuple. An opclass'es support function should never
need to access the values for other columns.

Does minmaxUnion handle NULLs correctly?

minmaxUnion pfrees the old values. Is that necessary? What memory
context does the function run in? If the code runs in a short-lived
memory context, you might as well just let them leak. If it runs in a
long-lived context, well, perhaps it shouldn't. It's nicer to write
functions that can leak freely. IIRC, GiST and GIN runs the support
functions in a temporary context. In any case, it might be worth noting
explicitly in the docs which functions may leak and which may not.

If you add a new datatype, and define b-tree operators for it, what is
required to create a minmax opclass for it? Would it be possible to
generalize the functions in brin_minmax.c so that they can be reused for
any datatype (with b-tree operators) without writing any new C code? I
think we're almost there; the only thing that differs between each data
type is the opcinfo function. Let's pass the type OID as argument to the
opcinfo function. You could then have just a single minmax_opcinfo
function, instead of the macro to generate a separate function for each
built-in datatype.

In general, this patch is in pretty good shape now, thanks!

- Heikki

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2014-09-09 11:49:30 Re: [TODO] Process pg_hba.conf keywords as case-insensitive
Previous Message Heikki Linnakangas 2014-09-09 11:13:08 Re: LIMIT for UPDATE and DELETE