
Performance problem with large insert statements

From: Barry Lind <barry(at)xythos(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Performance problem with large insert statements
Date: 2001-11-29 06:19:13
Message-ID: 3C05D361.8030009@xythos.com
Lists: pgsql-hackers
In looking at some performance issues (I was trying to measure the
overhead of TOAST) I found that large insert statements were very slow.

My test case involved reading a file (a 6MB file in my tests) and
inserting it into the database into a "largeobject"-like table defined
as follows:

create table tblob1 (filename text, lastbyte integer, data bytea);

The first test program read the file 8000 bytes at a time and inserted
each chunk into the above table until the entire file was inserted.
This test program used a regular insert statement to do the inserting:
(insert into tblob1 values (?,?,?))
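
The core of that loop looks roughly like this (a sketch only; the file
name and connection details are placeholders, the real code is in the
attached BlobTestInsert.java):

import java.io.FileInputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class InsertSketch {
    public static void main(String[] args) throws Exception {
        Class.forName("org.postgresql.Driver");
        Connection conn = DriverManager.getConnection(
            "jdbc:postgresql://localhost/test", "user", "password");
        PreparedStatement ps = conn.prepareStatement(
            "insert into tblob1 values (?,?,?)");
        FileInputStream in = new FileInputStream("test.dat");
        byte[] buf = new byte[8000];
        int lastByte = 0;
        int n;
        while ((n = in.read(buf)) > 0) {
            byte[] chunk = new byte[n];
            System.arraycopy(buf, 0, chunk, 0, n);
            lastByte += n;
            ps.setString(1, "test.dat");
            ps.setInt(2, lastByte);
            ps.setBytes(3, chunk);  // driver escapes this into the SQL text
            ps.executeUpdate();
        }
        in.close();
        ps.close();
        conn.close();
    }
}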

For three runs of this test the average time to insert the 6M file into 
the database in 8000 byte rows (which ended up being 801 rows inserted 
into the table) was: 17.803 seconds

The second test read the file in 8000 byte chunks just like the first 
program, but it used a function to do the insert and called the function 
via the FastPath API.  The function was:

CREATE FUNCTION BYTEA_WRITE (TEXT, INTEGER, BYTEA) RETURNS INTEGER
AS '
BEGIN
   INSERT INTO TBLOB1 VALUES ($1, $2, $3);
   RETURN 1;
END;'
LANGUAGE 'plpgsql';
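
On the client side, driving that function through the driver's Fastpath
classes looks roughly like this (again a sketch, not the attached
BlobTestFunction.java; the cast used to reach getFastpathAPI() varies by
driver version, and the pg_proc query is one way to register the
function's OID):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.postgresql.fastpath.Fastpath;
import org.postgresql.fastpath.FastpathArg;

public class FastpathSketch {
    public static void main(String[] args) throws Exception {
        Class.forName("org.postgresql.Driver");
        Connection conn = DriverManager.getConnection(
            "jdbc:postgresql://localhost/test", "user", "password");

        // Register the function's OID with the Fastpath API.
        // (Older drivers cast to org.postgresql.Connection instead.)
        Fastpath fp = ((org.postgresql.PGConnection) conn).getFastpathAPI();
        Statement st = conn.createStatement();
        ResultSet rs = st.executeQuery(
            "select proname, oid from pg_proc where proname = 'bytea_write'");
        fp.addFunctions(rs);
        rs.close();
        st.close();

        byte[] chunk = new byte[8000];  // one 8000-byte piece of the file
        FastpathArg[] fpargs = new FastpathArg[] {
            new FastpathArg("test.dat"),
            new FastpathArg(8000),
            new FastpathArg(chunk)  // binary goes over the wire unescaped
        };
        int result = fp.getInteger("bytea_write", fpargs);
        conn.close();
    }
}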

For three runs of this test the average time to insert the 6MB file into 
the database in 8000 byte parts was: 2.645 seconds


Thus using the insert statement was almost an order of magnitude slower 
than using the function (17.803 sec vs. 2.645 sec).

Reading the data back from the server via a standard select statement 
takes 1.674 seconds on average.

I tried to run gprof to see where the time was going, but for some 
reason the gprof output on my gmon.out file doesn't have any timing 
information (all times are reported as 0.0) and I haven't been able to 
figure out why yet.  So I don't know what is taking up the bulk of the 
time.  (I suspect it is either the decoding of the bytea data, which the 
FastPath function call avoids, or the parser, which needs to parse 801 
8K SQL statements versus the single ~100-byte statement the function 
call requires.)
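
One way to see why the statement route could cost so much more: with
the octal escape form bytea literals use, every non-printable byte can
expand to five characters in the SQL text, so each "8K" insert may
really be a ~40K statement for the parser to chew through.  A rough
back-of-the-envelope sketch (the escape lengths here are my reading of
the bytea literal syntax, not code taken from the driver):

public class EscapeSize {
    // Escaped length of one byte inside a single-quoted bytea literal
    // (octal \\nnn form); an assumption about the 7.x input syntax.
    static int escapedLength(int b) {
        if (b == '\\') return 4;          // backslash -> \\\\
        if (b == '\'') return 2;          // quote -> ''
        if (b < 32 || b > 126) return 5;  // non-printable -> \\nnn
        return 1;                         // printable ASCII passes through
    }

    public static void main(String[] args) {
        // Worst case: all 8000 bytes non-printable (e.g. binary data).
        System.out.println("worst case: " + (8000 * escapedLength(0))
            + " characters of SQL per 8000-byte insert");
    }
}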

I have attached the two test programs (they are in java and use jdbc) 
and a SQL script that creates the table and function.

thanks,
--Barry

Attachment: create.sql
Description: text/plain (294 bytes)
Attachment: BlobTestFunction.java
Description: text/plain (4.1 KB)
Attachment: BlobTestInsert.java
Description: text/plain (3.2 KB)
