From: | "Sokolov Yura" <funny(dot)falcon(at)gmail(dot)com> |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | BUG #3875: Building GIN index on full table could be slower than filling table with existing index. |
Date: | 2008-01-15 13:20:32 |
Message-ID: | 200801151320.m0FDKW7g035228@wwwmaster.postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
The following bug has been logged online:
Bug reference: 3875
Logged by: Sokolov Yura
Email address: funny(dot)falcon(at)gmail(dot)com
PostgreSQL version: 8.3RC1 8.2.x
Operating system: Debian 4.0
Description: Building GIN index on full table could be slower than
filling table with existing index.
Details:
Here is pathological case:
\timing
drop table if exists test_gin;
create table test_gin (
id serial primary key,
ar int4[]
);
create index test_gin_ix_ar on test_gin
using gin
( ar );
-- And here is pathological pattern
insert into test_gin (ar)
select ARRAY[i, i-1, i+1, i-2, i+2]
from generate_series(1, 20000) as i;
drop index test_gin_ix_ar;
create index test_gin_ix_ar on test_gin
using gin
( ar );
Timing for this case:
Timing is on.
DROP TABLE
Time: 19,421 ms
CREATE TABLE
Time: 13,531 ms
CREATE INDEX
Time: 3,352 ms
INSERT 0 20000 !
Time: 1064,444 ms !!
DROP INDEX
Time: 30,925 ms
CREATE INDEX !
Time: 28180,484 ms !!!!
So that, creating index on full table is 26.5x slower than filling table
with existing index.
After applying a patch below timings are:
.....
INSERT 0 20000
Time: 1036,746 ms
.....
CREATE INDEX
Time: 220,073 ms
For other tested pattern patched timing usually
slightly better (5-10%) and in rare case slightly
worse(5-10%). Other tested patterns are:
insert into test_gin (ar)
select ARRAY[i, i-1, i+1, i-2, i+2]
from generate_series(1, 100) as i,
generate_series(1, 200) as j;
insert into test_gin (ar)
select ARRAY[i, i-1, i+1, i-2, i+2]
from generate_series(1, 200) as j,
generate_series(1, 100) as i;
insert into test_gin (ar)
select ARRAY[i, i-1, i+1, i-2, i+2]
from generate_series(1, 200) as j,
generate_series(1, 1000) as i;
insert into test_gin (ar)
select (select int_array_aggregate(k) from generate_series(i, i+200) as
k)
from generate_series(1, 200) as j,
generate_series(1, 100) as i;
Patch:
diff -pr postgresql-8.3RC1old/src/backend/access/gin/ginbulk.c
postgresql-8.3RC1/src/backend/access/gin/ginbulk.c
*** postgresql-8.3RC1old/src/backend/access/gin/ginbulk.c 2008-01-01
22:45:46.000000000 +0300
--- postgresql-8.3RC1/src/backend/access/gin/ginbulk.c 2008-01-14
16:00:48.000000000 +0300
*************** ginInitBA(BuildAccumulator *accum)
*** 28,33 ****
--- 28,34 ----
accum->maxdepth = 1;
accum->stackpos = 0;
accum->entries = NULL;
+ accum->nentries = 0;
accum->stack = NULL;
accum->allocatedMemory = 0;
accum->entryallocator = NULL;
*************** EAAllocate(BuildAccumulator *accum)
*** 44,49 ****
--- 45,51 ----
}
accum->length++;
+ accum->nentries++;
return accum->entryallocator + accum->length - 1;
}
diff -pr postgresql-8.3RC1old/src/backend/access/gin/gininsert.c
postgresql-8.3RC1/src/backend/access/gin/gininsert.c
*** postgresql-8.3RC1old/src/backend/access/gin/gininsert.c 2008-01-01
22:45:46.000000000 +0300
--- postgresql-8.3RC1/src/backend/access/gin/gininsert.c 2008-01-14
16:07:21.000000000 +0300
*************** ginBuildCallback(Relation index, HeapTup
*** 238,244 ****
buildstate->indtuples += ginHeapTupleBulkInsert(buildstate, *values,
&htup->t_self);
/* If we've maxed out our available memory, dump everything to the index
*/
! if (buildstate->accum.allocatedMemory >= maintenance_work_mem * 1024L)
{
ItemPointerData *list;
Datum entry;
--- 238,246 ----
buildstate->indtuples += ginHeapTupleBulkInsert(buildstate, *values,
&htup->t_self);
/* If we've maxed out our available memory, dump everything to the index
*/
! if (buildstate->accum.allocatedMemory >= maintenance_work_mem * 1024L
! || (buildstate->accum.maxdepth > 16 &&
! buildstate->accum.nentries < (1 << (buildstate->accum.maxdepth /
2))))
{
ItemPointerData *list;
Datum entry;
diff -pr postgresql-8.3RC1old/src/include/access/gin.h
postgresql-8.3RC1/src/include/access/gin.h
*** postgresql-8.3RC1old/src/include/access/gin.h 2008-01-01
22:45:56.000000000 +0300
--- postgresql-8.3RC1/src/include/access/gin.h 2008-01-14 15:59:51.000000000
+0300
*************** typedef struct
*** 450,455 ****
--- 450,456 ----
{
GinState *ginstate;
EntryAccumulator *entries;
+ uint32 nentries;
uint32 maxdepth;
EntryAccumulator **stack;
uint32 stackpos;
From | Date | Subject | |
---|---|---|---|
Next Message | Ruben Camargo Gomez | 2008-01-15 16:52:13 | BUG #3876: Problems migrating databases |
Previous Message | Sokolov Yura | 2008-01-15 11:28:37 | Re: BUG #3860: xpath crashes backend when is querying xmlagg result |