Skip site navigation (1) Skip section navigation (2)

Insert performance, what should I expect?

From: Brock Henry <brock(dot)henry(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Insert performance, what should I expect?
Date: 2004-10-20 01:53:37
Message-ID: 97b3fe2041019185337a8c3d8@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hi, 

I've after some opinions about insert performance.

I'm importing a file with 13,002 lines to a database that ends up with
75,703 records across 6 tables. This is a partial file – the real data
is 4 files with total lines 95174. I'll be loading these files each
morning, and then running a number of queries on them.

The select queries run fast enough, (mostly - 2 queries are slow but
I'll look into that later), but importing is slower than I'd like it
to be, but I'm wondering what to expect?

I've done some manual benchmarking running my script 'time script.pl'
I realise my script uses some of the time, bench marking shows that
%50 of the time is spent in dbd:execute.

Test 1, For each import, I'm dropping all indexes and pkeys/fkeys,
then importing, then adding keys and indexes. Then I've got successive
runs. I figure the reindexing will get more expensive as the database
grows?

Successive Imports: 44,49,50,57,55,61,72 (seconds)
= average 1051inserts/second (which now that I've written this seems
fairly good)

Test 2, no dropping etc of indexes, just INSERTs
Import – 61, 62, 73, 68, 78, 74 (seconds)
= average 1091 inserts/second

Machine is Linux 2.6.4, 1GB RAM, 3.something GHz XEON processor, SCSI
hdd's (raid1). PostgreSQL 7.4.2. Lightly loaded machine, not doing
much other than my script. Script and DB on same machine.

Sysctl –a | grep shm
kernel.shmmni = 4096
kernel.shmall = 134217728     (pages or bytes? Anyway…)
kernel.shmmax = 134217728

postgresql.conf
tcpip_socket = true
max_connections = 32
superuser_reserved_connections = 2
shared_buffers = 8192       
sort_mem = 4096               
vacuum_mem = 16384      
max_fsm_relations = 300  
fsync = true                    
wal_buffers = 64                
checkpoint_segments = 10         
effective_cache_size = 16000    
syslog = 1                      
silent_mode = false              
log_connections = true
log_pid = true
log_timestamp = true
stats_start_collector = true
stats_row_level = true

Can I expect it to go faster than this? I'll see where I can make my
script itself go faster, but I don't think I'll be able to do much.
I'll do some pre-prepare type stuff, but I don't expect significant
gains, maybe 5-10%. I'd could happily turn off fsync for this job, but
not for some other databases the server is hosting.

Any comments/suggestions would be appreciated.

Thanks :)

Brock Henry

Responses

pgsql-performance by date

Next:From: Matthew T. O'ConnorDate: 2004-10-20 02:12:10
Subject: Re: Insert performance, what should I expect?
Previous:From: Simon RiggsDate: 2004-10-20 00:33:16
Subject: Re: how much mem to give postgres?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group