I've been doing some testing for the Bacula project, which uses
PostgreSQL as one of the databases in which it stores backup catalogs.
Insert times are critical in this environment, as the app may insert
millions of records a day.
I've been evaluating a schema change for Bacula that takes a field
that's currently stored as a gruesome-to-work-with base64-encoded
representation of a binary blob, and expands it into a set of integer
fields that can be searched, indexed, etc.
The table size of the expanded form is marginally smaller than for the
base64-encoded string version. However, INSERT times are *CONSIDERABLY*
greater for the version with more fields. It takes 1011 seconds to
insert the base64 version, vs 1290 seconds for the expanded-fields
version. That's a difference of 279 seconds, or 27%.
Despite that, the final table sizes are the same.
The SQL dump for the base64 version is 1734MB and the expanded one is
2189MB, about a 25% increase. Given that the final table sizes are the
same, is the slowdown likely to just be the cost of parsing the extra
SQL, converting the textual representations of the numbers, etc?
If I use tab-separated input and COPY, the original-format file is
1300MB and the expanded-structure format is 1618MB. The performance hit
on COPY-based insert is not as bad, at 161s vs 182s (13%), but still
Any ideas about what I might be able to do to improve the efficiency of
inserting records with many integer fields?
In case it's of interest, the base64 and expanded schema are:
CREATE TABLE file (
fileid bigint NOT NULL,
fileindex integer DEFAULT 0 NOT NULL,
jobid integer NOT NULL,
pathid integer NOT NULL,
filenameid integer NOT NULL,
markid integer DEFAULT 0 NOT NULL,
lstat text NOT NULL,
md5 text NOT NULL
CREATE TABLE file (
( Yes, those are the fields of a `struct lstat' ).
pgsql-performance by date
|Next:||From: PFC||Date: 2009-04-14 09:18:31|
|Subject: Re: difficulties with time based queries|
|Previous:||From: Grzegorz Jaśkiewicz||Date: 2009-04-13 18:47:18|
|Subject: Re: Postgres 8.x on Windows Server in production|