Insert performance, what should I expect?

From: Brock Henry <brock(dot)henry(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Insert performance, what should I expect?
Date: 2004-10-20 01:53:37
Message-ID: 97b3fe2041019185337a8c3d8@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

I've after some opinions about insert performance.

I'm importing a file with 13,002 lines to a database that ends up with
75,703 records across 6 tables. This is a partial file – the real data
is 4 files with total lines 95174. I'll be loading these files each
morning, and then running a number of queries on them.

The select queries run fast enough, (mostly - 2 queries are slow but
I'll look into that later), but importing is slower than I'd like it
to be, but I'm wondering what to expect?

I've done some manual benchmarking running my script 'time script.pl'
I realise my script uses some of the time, bench marking shows that
%50 of the time is spent in dbd:execute.

Test 1, For each import, I'm dropping all indexes and pkeys/fkeys,
then importing, then adding keys and indexes. Then I've got successive
runs. I figure the reindexing will get more expensive as the database
grows?

Successive Imports: 44,49,50,57,55,61,72 (seconds)
= average 1051inserts/second (which now that I've written this seems
fairly good)

Test 2, no dropping etc of indexes, just INSERTs
Import – 61, 62, 73, 68, 78, 74 (seconds)
= average 1091 inserts/second

Machine is Linux 2.6.4, 1GB RAM, 3.something GHz XEON processor, SCSI
hdd's (raid1). PostgreSQL 7.4.2. Lightly loaded machine, not doing
much other than my script. Script and DB on same machine.

Sysctl –a | grep shm
kernel.shmmni = 4096
kernel.shmall = 134217728 (pages or bytes? Anyway…)
kernel.shmmax = 134217728

postgresql.conf
tcpip_socket = true
max_connections = 32
superuser_reserved_connections = 2
shared_buffers = 8192
sort_mem = 4096
vacuum_mem = 16384
max_fsm_relations = 300
fsync = true
wal_buffers = 64
checkpoint_segments = 10
effective_cache_size = 16000
syslog = 1
silent_mode = false
log_connections = true
log_pid = true
log_timestamp = true
stats_start_collector = true
stats_row_level = true

Can I expect it to go faster than this? I'll see where I can make my
script itself go faster, but I don't think I'll be able to do much.
I'll do some pre-prepare type stuff, but I don't expect significant
gains, maybe 5-10%. I'd could happily turn off fsync for this job, but
not for some other databases the server is hosting.

Any comments/suggestions would be appreciated.

Thanks :)

Brock Henry

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Matthew T. O'Connor 2004-10-20 02:12:10 Re: Insert performance, what should I expect?
Previous Message Simon Riggs 2004-10-20 00:33:16 Re: how much mem to give postgres?