gin fast insert performance

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: gin fast insert performance
Date: 2009-01-26 07:58:02
Message-ID: 1232956682.3045.56.camel@jdavis
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Here is a test of the fast insert patch. The patch has gone through some
changes, so this set of tests is to see the current performance impact
compared with HEAD.

The test is simple: inserting a bunch of integer arrays into a table
with a GIN index on the array column.

I'm testing with small work_mem and large work_mem because the smaller
work mem should be periodically flushing the pending list throughout a
large insert, while large work_mem should allow larger batches to build
up before flushing the pending list. For HEAD, larger work_mem should
have no effect.

HEAD (work_mem = 1MB):

10000 tuples, 1000 array elements each
insert: run1: 127859.012 ms; run2: 124092.098 ms
vacuum analyze: run1: 1681.521 ms; run2: 1688.483 ms

1000000 tuples, 10 array elements each
insert: run1: 122069.072 ms; run2: 116476.058 ms
vacuum analyze: run1: 3349.210 ms; run2: 2826.328 ms

With Fast Insert Patch (work_mem = 1MB):

10000 tuples, 1000 array elements each
insert: run1: 111376.670 ms; run2: 110733.455 ms
vacuum analyze: run1: 2078.427 ms; run2: 1781.152 ms

1000000 tuples, 10 array elements each
insert: run1: 65743.075 ms; run2: 65252.698 ms
vacuum analyze: run1: 3458.500 ms; run2: 3719.916 ms

With Fast Insert Patch (work_mem = 1GB):

10000 tuples, 1000 array elements each
insert: run1: 69419.986 ms; run2: 68985.635 ms
vacuum analyze: run1: 57521.067 ms; run2: 43385.682 ms

1000000 tuples, 10 array elements each
insert: run1: 48343.827 ms; run2: 49192.153 ms
vacuum analyze: run1: 21134.267 ms; run2: 20474.775 ms

With the fast insert patch, the total time for insert + vacuum isn't
much different with increased work_mem, but increased work_mem clearly
defers a lot of the work to VACUUM.

For comparison, building the index afterward is:
10000 tuples, 1000 array elements each
insert time: 7531.645 ms
index build time: 24393.737 ms
1000000 tuples, 10 array elements each
insert time: 11564.604 ms
index build time: 12753.891 ms

So, unfortunately, the fast insert patch does not appear to bring the
overall time anywhere close to building the index from scratch. When the
work_mem is set to 1GB, the VACUUM took about twice as long to run than
the entire index build. Teodor, can you explain why that might be?

It does show improvement, and I think my test case might just be too
small. It seems like a lot of time is used inserting into the pending
list, but it seems like it should be a simple operation. Maybe that can
be optimized?

Regards,
Jeff Davis

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Gregory Stark 2009-01-26 08:08:25 Re: V4 of PITR performance improvement for 8.4
Previous Message Koichi Suzuki 2009-01-26 07:49:03 Re: V4 of PITR performance improvement for 8.4