Mass file imports

From: Greg Spiegelberg <gspiegelberg(at)cranel(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Mass file imports
Date: 2003-07-21 18:51:36
Message-ID: 3F1C3638.8060100@cranel.com
Lists: pgsql-performance

Hello,

I'm hunting for some advice on loading 50,000+ files all less than
32KB to a 7.3.2 database. The table is simple.

create table files (
    id   int8 not null primary key,
    file text not null,
    size int8 not null,
    uid  int  not null,
    raw  oid
);

The script (currently bash) pulls a TAR file out of a queue, unpacks it
to a large ramdisk mounted with noatime, and performs a battery of tests
on the files included in the TAR file. For each file in the TAR it will
add the following to a SQL file...

update files set raw=lo_import('/path/to/file/from/tar') where
file='/path/to/file/from/tar';

This file begins with BEGIN; and ends with END; and is fed to Postgres
via a "psql -f sqlfile" command. This part of the process can take
anywhere from 30 to over 90 minutes depending on the number of files
included in the TAR file.
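The batch-build step described above can be sketched roughly as follows. The directory and file names here are made up for illustration (stand-ins for the ramdisk unpack directory and the TAR contents), and the psql call at the end is shown commented out since the database name is an assumption:

```shell
#!/bin/bash
# Sketch of the batch-load step: build one transaction-wrapped SQL file
# for every file unpacked from the TAR, then feed it to Postgres in a
# single psql run. Paths below are placeholders, not the real ones.

TARDIR=$(mktemp -d)              # stands in for the ramdisk unpack dir
printf 'data' > "$TARDIR/a.txt"  # sample files standing in for TAR contents
printf 'data' > "$TARDIR/b.txt"

SQLFILE=$(mktemp)
echo "BEGIN;" > "$SQLFILE"
find "$TARDIR" -type f | while read -r f; do
    # one lo_import() update per file, matching the row by its path
    echo "update files set raw=lo_import('$f') where file='$f';" >> "$SQLFILE"
done
echo "END;" >> "$SQLFILE"

# Feed the whole batch to Postgres in one transaction, as in the post
# ("dbname" is a placeholder):
# psql -f "$SQLFILE" dbname
```

Wrapping the whole batch in a single BEGIN/END means one commit for the entire TAR, so the per-statement cost is dominated by the lo_import() calls themselves rather than transaction overhead.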

System is a RedHat 7.3 running a current 2.4.20 RedHat kernel and
dual PIII 1.4GHz
2GB of memory
512MB ramdisk (mounted noatime)
mirrored internal SCSI160 10k rpm drives for OS and swap
1 PCI 66MHz 64bit QLA2300
1 Gbit SAN with several RAID5 LUNs on a Hitachi 9910

All filesystems are ext3.

Any thoughts?

Greg

--
Greg Spiegelberg
Sr. Product Development Engineer
Cranel, Incorporated.
Phone: 614.318.4314
Fax: 614.431.8388
Email: gspiegelberg(at)Cranel(dot)com
Cranel. Technology. Integrity. Focus.
