Re: Load experimentation

From: Scott Mead <scott(dot)lists(at)enterprisedb(dot)com>
To: Ben Brehmer <benbrehmer(at)gmail(dot)com>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Load experimentation
Date: 2009-12-07 18:33:44
Message-ID: d3ab2ec80912071033x1e838e98n45bd3b4ef668d3a5@mail.gmail.com
Lists: pgsql-performance

On Mon, Dec 7, 2009 at 1:12 PM, Ben Brehmer <benbrehmer(at)gmail(dot)com> wrote:

> Hello All,
>
> I'm in the process of loading a massive amount of data (500 GB). After some
> initial timings, I'm looking at 260 hours to load the entire 500 GB. Ten days
> seems like an awfully long time, so I'm searching for ways to speed this up.
> The load is happening in the Amazon cloud (EC2), on a m1.large instance:
> -7.5 GB memory
> -4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units each)
> -64-bit platform
>
>
> So far I have modified my postgresql.conf file (PostgreSQL 8.1.3). The
> modifications I have made are as follows:
>

Can you go with PG 8.4? That's a start :-)

>
> shared_buffers = 786432
> work_mem = 10240
> maintenance_work_mem = 6291456
> max_fsm_pages = 3000000
> wal_buffers = 2048
> checkpoint_segments = 200
> checkpoint_timeout = 300
> checkpoint_warning = 30
> autovacuum = off
>

I'd set fsync=off for the load. I'd also make sure that you're using the
COPY command (on the server side) to do the load.
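
Roughly, something like this; the table and file names here are made up,
and note that fsync has to be flipped in postgresql.conf and the config
reloaded (it isn't settable per session):

  # postgresql.conf, for the duration of the load only; a crash while
  # fsync is off can corrupt the cluster, so turn it back on afterwards
  fsync = off

  -- then in psql, with the data file on the database server itself:
  COPY my_table FROM '/data/my_table.csv' WITH CSV;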
