
Re: Load experimentation

From: Ben Brehmer <benbrehmer(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Cc: Thom Brown <thombrown(at)gmail(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, craig_james(at)emolecules(dot)com, kbuckham(at)applocation(dot)net, scott(dot)lists(at)enterprisedb(dot)com
Subject: Re: Load experimentation
Date: 2009-12-07 19:12:12
Lists: pgsql-performance
Thanks for the quick responses. I will respond to all questions in one message:

By "loading data" I mean: "psql -U postgres -d somedatabase -f 
sql_file.sql".  The sql_file.sql contains CREATE TABLE and INSERT 
statements. No indexes are present or created during the load.

OS: x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20080704 
(Red Hat 4.1.2-44)

PostgreSQL: I will try upgrading to latest version.

COPY command: Unfortunately I'm stuck with INSERTs due to the way 
this data was generated (Hadoop/MapReduce).

Transactions: I have started a second load process with chunks of 1000 
inserts wrapped in a transaction. It's dropped the load time for 1000 
inserts from 1 hour to 7 minutes :)
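
For reference, the chunking described above could be done with a small filter like the one below. This is only a sketch, not the exact script used for this load: the function name wrap_in_transactions is made up for illustration, and it assumes every INSERT statement in sql_file.sql sits on a single line.

```shell
#!/bin/sh
# Sketch: wrap every N INSERT lines of a SQL stream in BEGIN/COMMIT.
# Assumption: each INSERT statement occupies exactly one line.
# wrap_in_transactions is a hypothetical helper name, not part of psql.
wrap_in_transactions() {
    chunk_size="${1:-1000}"
    awk -v n="$chunk_size" '
        # Count any line that starts with INSERT (case-insensitive).
        toupper($0) ~ /^[ \t]*INSERT/ {
            if (count % n == 0) print "BEGIN;"
            print
            count++
            if (count % n == 0) print "COMMIT;"
            next
        }
        # Everything else (CREATE TABLE, comments, blank lines) passes through.
        { print }
        # Close a final, partially filled chunk.
        END { if (count % n != 0) print "COMMIT;" }
    '
}
```

It would be used as "wrap_in_transactions 1000 < sql_file.sql | psql -U postgres -d somedatabase". A statement that falls between two INSERTs of the same chunk simply runs inside that chunk's transaction.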

Disk Setup: Using a single-disk Amazon image for the destination 
(database). The source is coming from an EBS volume. I didn't think 
there were any disk options in Amazon?



On 07/12/2009 10:39 AM, Thom Brown wrote:
> 2009/12/7 Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>
>     Ben Brehmer <benbrehmer(at)gmail(dot)com> wrote:
>     > -7.5 GB memory
>     > -4 EC2 Compute Units (2 virtual cores with 2 EC2 Compute Units
>     >    each)
>     > -64-bit platform
>     What OS?
>     > (PostgreSQL 8.1.3)
>     Why use such an antiquated, buggy version?  Newer versions are
>     faster.
>     -Kevin
> I'd agree with trying to use the latest version you can.
> How are you loading this data?  I'd make sure you haven't got any 
> indices, primary keys, triggers or constraints on your tables before 
> you begin the initial load, just add them after.  Also use either the 
> COPY command for loading, or prepared transactions.  Individual insert 
> commands will just take way too long.
> Regards
> Thom
