Re: Extremely slow intarray index creation and inserts.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
Cc: Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Extremely slow intarray index creation and inserts.
Date: 2009-03-18 20:21:26
Message-ID: 15055.1237407686@sss.pgh.pa.us
Lists: pgsql-performance

Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com> writes:
> Oleg Bartunov wrote:
> OB:> it's not about short or long arrays, it's about small or big
> OB:> cardinality of the whole set (the number of unique elements)

> I'm re-reading the docs and it still wasn't obvious to me. A
> potential docs patch is attached below.

Done, though not in exactly those words. I wonder though if we can
be less vague about it --- can we suggest a typical cutover point?
Like "use gist__intbig_ops if there are more than about 10,000 distinct
array values"? Even a rough order of magnitude for where to worry
about this would save a lot of people time.
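
To make the proposed rule of thumb concrete, here is a minimal sketch of how such a cutover could be applied. It assumes the quantity to measure is the number of unique integers across the whole column, following Oleg's comment quoted above. The helper names are hypothetical, and the 10,000 default is only the tentative figure floated in this message, not a measured threshold.

```python
def count_distinct_elements(arrays):
    """Return the number of unique integers across all arrays in a column."""
    seen = set()
    for arr in arrays:
        seen.update(arr)
    return len(seen)


def suggest_opclass(arrays, cutover=10_000):
    """Pick an intarray GiST operator class name from the distinct-element count.

    `cutover` is an assumption: the rough figure suggested in the message,
    not a benchmarked value.
    """
    if count_distinct_elements(arrays) > cutover:
        return "gist__intbig_ops"
    return "gist__int_ops"


# Long arrays drawn from a small pool of values: few distinct elements.
long_arrays = [[i % 50 for i in range(1000)] for _ in range(20)]

# Short arrays that each hold different values: many distinct elements.
short_arrays = [[3 * i, 3 * i + 1, 3 * i + 2] for i in range(5000)]

print(suggest_opclass(long_arrays))
print(suggest_opclass(short_arrays))
```

The two sample columns show why array length is the wrong measure: the long arrays share only 50 distinct values, while the three-element arrays together contain 15,000.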

regards, tom lane

Index: intarray.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/intarray.sgml,v
retrieving revision 1.5
retrieving revision 1.6
diff -c -r1.5 -r1.6
*** intarray.sgml 10 Dec 2007 05:32:51 -0000 1.5
--- intarray.sgml 18 Mar 2009 20:18:18 -0000 1.6
***************
*** 237,245 ****
<para>
Two GiST index operator classes are provided:
<literal>gist__int_ops</> (used by default) is suitable for
! small and medium-size arrays, while
<literal>gist__intbig_ops</> uses a larger signature and is more
! suitable for indexing large arrays.
</para>

<para>
--- 237,246 ----
<para>
Two GiST index operator classes are provided:
<literal>gist__int_ops</> (used by default) is suitable for
! small- to medium-size data sets, while
<literal>gist__intbig_ops</> uses a larger signature and is more
! suitable for indexing large data sets (i.e., columns containing
! a large number of distinct array values).
</para>

<para>
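
One way to picture why a larger signature helps with many distinct values is the toy model below. It is a sketch under stated assumptions, not the intarray implementation: each array is folded into a fixed-width bitmask with a simple modulo hash, a parent index key is the bitwise OR of its children's masks, and the widths 256 and 4096 are arbitrary illustrative choices. Once nearly every bit in a parent key is set, that key can no longer rule out any subtree.

```python
def signature(values, nbits):
    """Fold integers into an nbits-wide bitmask (toy modulo hash, an assumption)."""
    sig = 0
    for v in values:
        sig |= 1 << (v % nbits)
    return sig


def union_fill(arrays, nbits):
    """Fraction of bits set in the OR of all array signatures.

    Models a parent key that must cover every array beneath it.
    """
    parent = 0
    for arr in arrays:
        parent |= signature(arr, nbits)
    return bin(parent).count("1") / nbits


# 10 long arrays sharing only 100 distinct values.
few_distinct = [[i % 100 for i in range(500)] for _ in range(10)]

# 1000 two-element arrays covering 2000 distinct values.
many_distinct = [[i, i + 1] for i in range(0, 2000, 2)]

for label, data in (("few", few_distinct), ("many", many_distinct)):
    for nbits in (256, 4096):
        print(label, nbits, union_fill(data, nbits))
```

In this model the short arrays with 2000 distinct values saturate the 256-bit key completely, while the 4096-bit key stays under half full; the long arrays with 100 distinct values leave the 256-bit key mostly empty.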
