Re: Extremely slow intarray index creation and inserts.

From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Extremely slow intarray index creation and inserts.
Date: 2009-03-17 18:28:35
Message-ID: 49BFEBD3.4000006@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Ron Mayer wrote:
> This table summarizes some of the times, shown more completely
> in a script below.
> =================================================================
> create gist index on 10000 = 5 seconds
> create gist index on 20000 = 32 seconds
> create gist index on 30000 = 39 seconds
> create gist index on 40000 = 102 seconds
> create gist index on 70000 = I waited 10 minutes before giving up

Finished after 34 minutes.

vm=# create index "gist70000" on tmp_intarray_test using GIST (my_int_array gist__int_ops);
CREATE INDEX
Time: 2069836.856 ms

Is that expected, or does it sound like a bug to take over
half an hour to index 70000 rows of mostly 5 and 6-element
integer arrays?

> create gin index on 40000 = 0.7 seconds
> create gist index on 40000 = 5 seconds using gist__intbig_ops
>
> create gin index on 70000 = 1.0 seconds
> create gist index on 70000 = 9 seconds using gist__intbig_ops
> ==================================================================
>
> This surprised me for a number of reasons. The longest
> array in the table is 9 elements long, and most are 5 or 6
> so I'd have thought the default ops would have been better
> than the big_ops. Secondly, I thought gin inserts were expected
> to be slower than gist, but I'm finding them much faster.
>
> Nothing seems particular strange about the data. A dump
> of an excerpt of the table can be found at
> http://0ape.com/tmp/int_array.dmp
> (Yes, the production table had other columns; but this
> column alone is enough to demonstrate the problem.)
>
> Any thoughts what I'm doing wrong?
> Ron
>
> psql output showing the timing follows.
>
> ===============================================================================
> vm=# create table tmp_intarray_test as select tag_id_array as my_int_array from taggings;
> SELECT
> vm=# create table tmp_intarray_test_10000 as select * from tmp_intarray_test limit 10000;
> SELECT
> vm=# create table tmp_intarray_test_20000 as select * from tmp_intarray_test limit 20000;
> SELECT
> vm=# create table tmp_intarray_test_30000 as select * from tmp_intarray_test limit 30000;
> SELECT
> vm=# create table tmp_intarray_test_40000 as select * from tmp_intarray_test limit 40000;
> SELECT
> vm=# \timing
> Timing is on.
> vm=#
> vm=# create index "gist_10000 using GIST(my_int_array)" on tmp_intarray_test_10000 using GIST (my_int_array);
> CREATE INDEX
> Time: 5760.050 ms
> vm=# create index "gist_20000 using GIST(my_int_array)" on tmp_intarray_test_20000 using GIST (my_int_array);
> CREATE INDEX
> Time: 32500.911 ms
> vm=# create index "gist_30000 using GIST(my_int_array)" on tmp_intarray_test_30000 using GIST (my_int_array);
> CREATE INDEX
> Time: 39284.031 ms
> vm=# create index "gist_40000 using GIST(my_int_array)" on tmp_intarray_test_40000 using GIST (my_int_array);
> CREATE INDEX
> Time: 102572.780 ms
> vm=#
> vm=#
> vm=#
> vm=#
>
> vm=#
> vm=#
> vm=# create index "gin_40000" on tmp_intarray_test_40000 using GIN (my_int_array gin__int_ops);
> CREATE INDEX
> Time: 696.668 ms
> vm=# create index "gist_big_4000" on tmp_intarray_test_40000 using GIST (my_int_array gist__intbig_ops);
> CREATE INDEX
> Time: 5227.353 ms
> vm=#
> vm=#
> vm=#
> vm=# \d tmp_intarray_test
> Table "public.tmp_intarray_test"
> Column | Type | Modifiers
> --------------+-----------+-----------
> my_int_array | integer[] |
>
> vm=# select max(array_dims(my_int_array)) from tmp_intarray_test_30000;
> max
> -------
> [1:9]
> (1 row)
>
> Time: 119.607 ms
> vm=#
> vm=#
> vm=# select version();
> version
> -----------------------------------------------------------------------------------
> PostgreSQL 8.3.6 on i686-pc-linux-gnu, compiled by GCC gcc (Debian 4.3.3-1) 4.3.3
> (1 row)
>
> Time: 12.169 ms
>
> vm=# create index "gistbig70000" on tmp_intarray_test using GIST (my_int_array gist__intbig_ops);
> CREATE INDEX
> Time: 9156.886 ms
> vm=# create index "gin70000" on tmp_intarray_test using GIN (my_int_array gin__int_ops);
> CREATE INDEX
> Time: 1060.752 ms
> vm=# create index "gist7000" on tmp_intarray_test using GIST (my_int_array gist__int_ops);
> [.... it just sits here for 10 minutes or more ....]
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Jignesh K. Shah 2009-03-17 21:41:20 Re: Proposal of tunable fix for scalability of 8.4
Previous Message Ron Mayer 2009-03-17 17:09:36 Extremely slow intarray index creation and inserts.