Quick Links

Re: Disk Performance Problem on Large DB

From:	Kenneth Marshall <ktm(at)rice(dot)edu>
To:	Jonathan Hoover <jhoover(at)yahoo-inc(dot)com>
Cc:	"pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject:	Re: Disk Performance Problem on Large DB
Date:	2010-11-04 21:02:50
Message-ID:	20101104210250.GL27429@aart.is.rice.edu
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-admin

On Thu, Nov 04, 2010 at 01:42:49PM -0700, Jonathan Hoover wrote:
> Hello,
>
> I have a RHEL 5 box, 4 GB RAM, single hard drive SATA, Intel Core 2 Duo 2.4. A basic workstation.
>
> I have a simple database, with one table for now. It has 4 columns:
>
> anid serial primary key unique,
> time timestamp,
> source varchar(5),
> unitid varchar(15),
> guid varchar(32)
>
> There is a btree index on each.
>
> I am loading data 1,000,000 (1M) rows at a time using psql and a COPY command. Once I hit 2M rows, my performance just drops out, and the next 1M never finishes. It takes 7 minutes for 1M rows to load. Once 2M are in there, I've waited an hour, and nothing. It doesn't seem to matter which 1M rows I try to load next, none ever finish. Each 1M rows is about 70MB on disk in the raw input file.
>
> I have "atop" installed, and it reports the drives at 100%, which it reports for the first 1M rows too. The MBw/s goes from 20+ on the first 2M rows, down to about 4 MBw/s or less now. The processor usage is at about 2 to 8% at this time (used by postgres).
>
> I have even waited for 1M rows to load, then done a vacuum for no good reason, then even restarted postgresql. I've made sure no disk or proc activity is happening before I start the next 1M rows. None of that seems to matter.
>
> I have a total of about 70M rows to load, but am at a standstill. I've read up on whatever performance docs I can find online, but I am not getting anywhere.
>
> I've increased shared_buffers to 256MB, and I've tried it with fsync commented out as per the default config. I've also tried it with fsync=off. No difference.
>
> Ideas? Thanks in advance,
> Jon

The initial 1M load if the table has just been truncated or created
has no WAL logging. You can boost maintenance_work_mem to increase
index creation/update performance. You are severely I/O limited and
would be better off dropping your indexes during the load and re-
creating them afterwards. If you are starting with an empty table,
truncate it and then load all the data in a single transaction, all
7 COPY commands. Then COMMIT and build the indexes. Your question
is also missing key information like config details, PostgreSQL version,
...

Cheers,
Ken

In response to

Disk Performance Problem on Large DB at 2010-11-04 20:42:49 from Jonathan Hoover

Responses

Re: Disk Performance Problem on Large DB at 2010-11-05 03:03:03 from Jonathan Hoover

Browse pgsql-admin by date

	From	Date	Subject
Next Message	Kevin Grittner	2010-11-04 21:03:35	Re: Disk Performance Problem on Large DB
Previous Message	Jonathan Hoover	2010-11-04 20:42:49	Disk Performance Problem on Large DB