From: | Kenneth Marshall <ktm(at)rice(dot)edu> |
---|---|
To: | Jonathan Hoover <jhoover(at)yahoo-inc(dot)com> |
Cc: | "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org> |
Subject: | Re: Disk Performance Problem on Large DB |
Date: | 2010-11-04 21:02:50 |
Message-ID: | 20101104210250.GL27429@aart.is.rice.edu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin |
On Thu, Nov 04, 2010 at 01:42:49PM -0700, Jonathan Hoover wrote:
> Hello,
>
> I have a RHEL 5 box, 4 GB RAM, single hard drive SATA, Intel Core 2 Duo 2.4. A basic workstation.
>
> I have a simple database, with one table for now. It has 4 columns:
>
> anid serial primary key unique,
> time timestamp,
> source varchar(5),
> unitid varchar(15),
> guid varchar(32)
>
> There is a btree index on each.
>
> I am loading data 1,000,000 (1M) rows at a time using psql and a COPY command. Once I hit 2M rows, my performance just drops out, and the next 1M never finishes. It takes 7 minutes for 1M rows to load. Once 2M are in there, I've waited an hour, and nothing. It doesn't seem to matter which 1M rows I try to load next, none ever finish. Each 1M rows is about 70MB on disk in the raw input file.
>
> I have "atop" installed, and it reports the drives at 100%, which it reports for the first 1M rows too. The MBw/s goes from 20+ on the first 2M rows, down to about 4 MBw/s or less now. The processor usage is at about 2 to 8% at this time (used by postgres).
>
> I have even waited for 1M rows to load, then done a vacuum for no good reason, then even restarted postgresql. I've made sure no disk or proc activity is happening before I start the next 1M rows. None of that seems to matter.
>
> I have a total of about 70M rows to load, but am at a standstill. I've read up on whatever performance docs I can find online, but I am not getting anywhere.
>
> I've increased shared_buffers to 256MB, and I've tried it with fsync commented out as per the default config. I've also tried it with fsync=off. No difference.
>
> Ideas? Thanks in advance,
> Jon
The initial 1M load if the table has just been truncated or created
has no WAL logging. You can boost maintenance_work_mem to increase
index creation/update performance. You are severely I/O limited and
would be better off dropping your indexes during the load and re-
creating them afterwards. If you are starting with an empty table,
truncate it and then load all the data in a single transaction, all
7 COPY commands. Then COMMIT and build the indexes. Your question
is also missing key information like config details, PostgreSQL version,
...
Cheers,
Ken
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin Grittner | 2010-11-04 21:03:35 | Re: Disk Performance Problem on Large DB |
Previous Message | Jonathan Hoover | 2010-11-04 20:42:49 | Disk Performance Problem on Large DB |