Skip site navigation (1) Skip section navigation (2)

Re: Extremely slow intarray index creation and inserts.

From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Extremely slow intarray index creation and inserts.
Date: 2009-03-17 18:28:35
Message-ID: 49BFEBD3.4000006@cheapcomplexdevices.com (view raw or flat)
Thread:
Lists: pgsql-performance
Ron Mayer wrote:
> This table summarizes some of the times, shown more completely
> in a script below.
> =================================================================
> create gist index on 10000  =   5   seconds
> create gist index on 20000  =  32   seconds
> create gist index on 30000  =  39   seconds
> create gist index on 40000  = 102   seconds
> create gist index on 70000  = I waited 10 minutes before giving up

  Finished after 34 minutes.

vm=# create index "gist70000" on tmp_intarray_test using GIST (my_int_array gist__int_ops);
CREATE INDEX
Time: 2069836.856 ms

Is that expected, or does it sound like a bug to take over
half an hour to index 70000 rows of mostly 5 and 6-element
integer arrays?


> create gin  index on 40000  =   0.7 seconds
> create gist index on 40000  =   5   seconds using gist__intbig_ops
> 
> create gin  index on 70000  =   1.0 seconds
> create gist index on 70000  =   9   seconds using gist__intbig_ops
> ==================================================================
> 
> This surprised me for a number of reasons.   The longest
> array in the table is 9 elements long, and most are 5 or 6
> so I'd have thought the default ops would have been better
> than the big_ops.  Secondly, I thought gin inserts were expected
> to be slower than gist, but I'm finding them much faster.
> 
> Nothing seems particular strange about the data.  A dump
> of an excerpt of the table can be found at
> http://0ape.com/tmp/int_array.dmp
> (Yes, the production table had other columns; but this
> column alone is enough to demonstrate the problem.)
> 
>    Any thoughts what I'm doing wrong?
>    Ron
> 
> psql output showing the timing follows.
> 
> ===============================================================================
> vm=# create table tmp_intarray_test as select tag_id_array as my_int_array from taggings;
> SELECT
> vm=# create table tmp_intarray_test_10000 as select * from tmp_intarray_test limit 10000;
> SELECT
> vm=# create table tmp_intarray_test_20000 as select * from tmp_intarray_test limit 20000;
> SELECT
> vm=# create table tmp_intarray_test_30000 as select * from tmp_intarray_test limit 30000;
> SELECT
> vm=# create table tmp_intarray_test_40000 as select * from tmp_intarray_test limit 40000;
> SELECT
> vm=# \timing
> Timing is on.
> vm=#
> vm=# create index "gist_10000 using GIST(my_int_array)" on tmp_intarray_test_10000 using GIST (my_int_array);
> CREATE INDEX
> Time: 5760.050 ms
> vm=# create index "gist_20000 using GIST(my_int_array)" on tmp_intarray_test_20000 using GIST (my_int_array);
> CREATE INDEX
> Time: 32500.911 ms
> vm=# create index "gist_30000 using GIST(my_int_array)" on tmp_intarray_test_30000 using GIST (my_int_array);
> CREATE INDEX
> Time: 39284.031 ms
> vm=# create index "gist_40000 using GIST(my_int_array)" on tmp_intarray_test_40000 using GIST (my_int_array);
> CREATE INDEX
> Time: 102572.780 ms
> vm=#
> vm=#
> vm=#
> vm=#
> 
> vm=#
> vm=#
> vm=#  create index "gin_40000"     on tmp_intarray_test_40000 using GIN (my_int_array gin__int_ops);
> CREATE INDEX
> Time: 696.668 ms
> vm=#  create index "gist_big_4000" on tmp_intarray_test_40000 using GIST (my_int_array gist__intbig_ops);
> CREATE INDEX
> Time: 5227.353 ms
> vm=#
> vm=#
> vm=#
> vm=# \d tmp_intarray_test
>    Table "public.tmp_intarray_test"
>     Column    |   Type    | Modifiers
> --------------+-----------+-----------
>  my_int_array | integer[] |
> 
> vm=# select max(array_dims(my_int_array)) from tmp_intarray_test_30000;
>   max
> -------
>  [1:9]
> (1 row)
> 
> Time: 119.607 ms
> vm=#
> vm=#
> vm=# select version();
>                                       version
> -----------------------------------------------------------------------------------
>  PostgreSQL 8.3.6 on i686-pc-linux-gnu, compiled by GCC gcc (Debian 4.3.3-1) 4.3.3
> (1 row)
> 
> Time: 12.169 ms
> 
> vm=#  create index "gistbig70000" on tmp_intarray_test using GIST (my_int_array gist__intbig_ops);
> CREATE INDEX
> Time: 9156.886 ms
> vm=#  create index "gin70000" on tmp_intarray_test using GIN (my_int_array gin__int_ops);
> CREATE INDEX
> Time: 1060.752 ms
> vm=#  create index "gist7000" on tmp_intarray_test using GIST (my_int_array gist__int_ops);
>       [.... it just sits here for 10 minutes or more ....]
> 


In response to

Responses

pgsql-performance by date

Next:From: Jignesh K. ShahDate: 2009-03-17 21:41:20
Subject: Re: Proposal of tunable fix for scalability of 8.4
Previous:From: Ron MayerDate: 2009-03-17 17:09:36
Subject: Extremely slow intarray index creation and inserts.

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group