contrib/intarray (was Re: Fixing GIN for empty/null/full-scan cases)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "David E(dot) Wheeler" <david(at)kineticode(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: contrib/intarray (was Re: Fixing GIN for empty/null/full-scan cases)
Date: 2011-01-08 21:59:11
Message-ID: 16961.1294523951@sss.pgh.pa.us
Lists: pgsql-hackers

"David E. Wheeler" <david(at)kineticode(dot)com> writes:
> On Jan 7, 2011, at 4:19 PM, Tom Lane wrote:
>> Well, actually, I just committed it. If you want to test, feel free.
>> Note that right now only the anyarray && <@ @> operators are genuinely
>> fixed ... I plan to hack on tsearch and contrib pretty soon though.

> Hrm, the queries I wrote for this sort of thing use intarray:

I'm going to work on contrib/intarray first (before tsearch etc)
so that you can do whatever testing you want sooner.

One of the things that first got me annoyed about the whole GIN
situation is that intarray's definitions of the <@ and @> operators were
inconsistent with the core operators of the same names. I believe that
the inconsistency has to go away. Really the only reason that intarray
has its own versions of these operators at all is that it can be faster
than the generic anyarray versions in core. There seem to be three ways
in which intarray is simpler/faster than the generic operators:

* restricted to integer arrays
* restricted to 1-D arrays
* doesn't allow nulls in the arrays

The first of these is pretty important from a speed perspective, and
it's also basically free because of the type system: the parser won't
attempt to apply intarray's operators to anything that's not an integer
array. The second one seems a little more dubious. AFAICS the code
isn't actually exploiting 1-D-ness anywhere; it always uses
ArrayGetNItems() to compute the array size, for example. I propose that
we just drop that restriction and let it accept arrays that are
multidimensional, implicitly linearizing the elements in storage order.
(Any created arrays will still be 1-D though.)
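To illustrate what "implicitly linearizing" would mean in practice, here is a minimal standalone sketch, not the intarray code itself. It assumes the elements sit contiguously in storage order, so a 2x3 array can be walked as six plain integers. The helper names total_items() and flat_contains() are hypothetical; total_items() only mimics the idea of taking the product of the dimension lengths, which is the kind of count ArrayGetNItems() is used for above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: number of elements in an array with the given
 * dimension lengths, i.e. the product of dims[0..ndim-1].
 * A zero-dimensional array is treated as empty.
 */
static size_t
total_items(int ndim, const int *dims)
{
    size_t n = 1;

    assert(ndim >= 0);
    if (ndim == 0)
        return 0;
    for (int i = 0; i < ndim; i++)
        n *= (size_t) dims[i];
    return n;
}

/*
 * Hypothetical "a @> b" check that ignores dimensionality entirely:
 * both inputs are flat runs of integers in storage order, and we ask
 * whether every element of b occurs somewhere in a.
 * Quadratic on purpose, to keep the sketch short.
 */
static int
flat_contains(const int *a, size_t na, const int *b, size_t nb)
{
    for (size_t i = 0; i < nb; i++)
    {
        int found = 0;

        for (size_t j = 0; j < na; j++)
        {
            if (a[j] == b[i])
            {
                found = 1;
                break;
            }
        }
        if (!found)
            return 0;
    }
    return 1;
}
```

Called with dims = {2, 3} and the six stored values, flat_contains() never looks at the shape, which is the point: once the element count comes from the dimension product, nothing in the comparison depends on the array being 1-D.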

It's harder to decide what to do about the third restriction.
If we keep it then intarray's <@ etc will throw errors in some cases
where core would not have. However, dealing with nulls will make the
code significantly uglier and probably slower than it is now; and that's
not work that I'm excited about doing right now anyway. So what I
propose for the moment is that intarray still have that restriction.
Maybe someone else will feel like fixing it later.

I will however fix the issue described here:
http://archives.postgresql.org/pgsql-bugs/2010-12/msg00032.php
that intarray sometimes throws "nulls not allowed" errors on
arrays that once contained nulls but now don't. That can be
done with a relatively localized patch --- we just need to look
a little harder when the ARR_HASNULL flag is set.
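As a rough illustration of "looking a little harder", the sketch below scans a null bitmap instead of trusting the mere presence of one. It is a standalone approximation, not the actual patch: bitmap_has_null() is a hypothetical name, and the layout assumed here (one bit per element, least significant bit first, 1 meaning "not null" and 0 meaning "null") is stated as an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return 1 if any of the first nitems elements is
 * marked null in the bitmap, else 0.
 *
 * Assumed layout: bit (i % 8) of byte (i / 8) describes element i,
 * with 1 = not null and 0 = null. A NULL bitmap pointer means the
 * array carries no bitmap at all, hence no nulls.
 */
static int
bitmap_has_null(const uint8_t *bitmap, size_t nitems)
{
    if (bitmap == NULL)
        return 0;

    for (size_t i = 0; i < nitems; i++)
    {
        if ((bitmap[i / 8] & (1u << (i % 8))) == 0)
            return 1;           /* found a genuinely null element */
    }
    return 0;                   /* bitmap present, but every bit is set */
}
```

An array that once held a null can still carry a bitmap after the null is overwritten; in that case every bit is set and the function returns 0, so a caller using a check like this would not raise "nulls not allowed".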

Comments, objections?

regards, tom lane
