Skip site navigation (1) Skip section navigation (2)

Re: Very poor performance loading 100M of sql data using copy

From: John Rouillard <rouilj(at)renesys(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Very poor performance loading 100M of sql data using copy
Date: 2008-04-29 15:16:22
Message-ID: 20080429151622.GP6622@renesys.com (view raw or flat)
Thread:
Lists: pgsql-performance
On Mon, Apr 28, 2008 at 02:16:02PM -0400, Greg Smith wrote:
> On Mon, 28 Apr 2008, John Rouillard wrote:
> 
> >   2008-04-21 11:36:43 UTC @(2761)i: LOG: checkpoints ... (27 seconds 
> >   apart)
> > so I changed:
> >  checkpoint_segments = 30
> >  checkpoint_warning = 150
> 
> That's good, but you might go higher than 30 for a bulk loading operation 
> like this, particularly on 8.1 where checkpoints are no fun.  Using 100 is 
> not unreasonable.

Ok. I can do that. I chose 30 to make the WAL logs span the 5 minute

  checkpoint_timeout = 300

so that the 30 segments wouldn't wrap over before the 5 minute
checkpoint that usually occurs. Maybe I should increase both the
timeout and the segments?
 
> >shared_buffers = 3000
> >I don't see any indication in the docs that increasing shared memory
> >would help speed up a copy operation.
> 
> The index blocks use buffer space, and what ends up happening if there's 
> not enough memory is they are written out more than they need to be (and 
> with your I/O hardware you need to avoid writes unless absolutely 
> necessary).

I forgot to mention the raid 1/0 is on a 3ware 9550SX-4LP raid card
setup as raid 1/0. The write cache is on and autoverify is turned off.

> Theoretically the OS is caching around that situation but 
> better to avoid it. 

The system is using 6-8MB of memory for cache.

> You didn't say how much RAM you have,

16GB total, but 8GB or so is taken up with other processes.

> but you should 
> start by a factor of 10 increase to 30,000 and see if that helps; if so, 
> try making it large enough to use 1/4 of total server memory.  3000 is 
> only giving the server 24MB of RAM to work with, and it's unfair to expect 
> it to work well in that situation.

So swap the memory usage from the OS cache to the postgresql process.
Using 1/4 as a guideline it sounds like 600,000 (approx 4GB) is a
better setting. So I'll try 300000 to start (1/8 of memory) and see
what it does to the other processes on the box.
 
> While not relevant to this exercise you'll need to set 
> effective_cache_size to a useful value one day as well.

This is a very lightly loaded database, a few queries/hour usually
scattered across the data set, so hopefully that won't be much of an
issue.

-- 
				-- rouilj

John Rouillard
System Administrator
Renesys Corporation
603-244-9084 (cell)
603-643-9300 x 111

In response to

Responses

pgsql-performance by date

Next:From: Greg SmithDate: 2008-04-29 15:58:00
Subject: Re: Very poor performance loading 100M of sql data using copy
Previous:From: John RouillardDate: 2008-04-29 15:04:32
Subject: Re: Very poor performance loading 100M of sql data using copy

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group