Re: Migration study, step 1: bulk write performance optimization

From: Dave Cramer <pg(at)fastcrypt(dot)com>
To: Mikael Carneholm <Mikael(dot)Carneholm(at)WirelessCar(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Migration study, step 1: bulk write performance optimization
Date: 2006-03-20 17:44:31
Message-ID: FB595DE3-F369-466D-977C-1BE87849D1A1@fastcrypt.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Others are reporting better performance on 8.1.x with very large
shared buffers. You may want to try tweaking that possibly as high as
20% of available memory

Dave
On 20-Mar-06, at 9:59 AM, Mikael Carneholm wrote:

> Ok, here's the deal:
>
> I am responisble for an exciting project of evaluating migration of
> a medium/large application for a well-known swedish car&truck
> manufacturer from a proprietary DB to Postgres. The size of the
> database is currently about 50Gb, annual growth depending on sales,
> but probably in the 30-50Gb range.
>
> Migrating the schema was easily done, mostly involving a search/
> replace of some vendor specific datatypes. The next step is to
> migrate the data itself, and for this we have written a Java app
> relying on JDBC metadata to map the tables in the source schema to
> the target schema. The goal right now is to find the set of
> parameters that gives as short bulk insert time as possible,
> minimizing downtime while the data itself is migrated.
>
> The machine used for the study is a Dell PE2850, 6GB memory, 1xXEON
> 3.0GHz/2MB cache, internal SCSI 0+1 raid (currently 4x36GB 10000rpm
> striped+mirrored, two more 146GB 15000rpm disks will arrive later).
> Not sure about the brand/model of the raid controller, so I'll
> leave that for now. File system is ext3(I know, maybe not the
> optimal choice but this is how it was when I got it) with a 8k
> block size. The OS currently installed is CentOS4.
>
> Until the new disks arrive, both the OS itself, pg_xlog and the
> data reside on the same disks. When they arrive, I will probably
> move the data to the new disks (need two more to get raid 0+1,
> though) and leave the OS + pg_xlog on the 10000rpm disks. Mounting
> the 15000rpm data disks with the noatime option (this is safe,
> right?) and using a 16kb block size (for read performance) will
> probably be considered as well.
>
> NOTE: this machine/configuration is NOT what we will be using in
> production if the study turns out OK, it's just supposed to work as
> a development machine in the first phase whose purpose more or less
> is to get familiar with configurating Postgres and see if we can
> get the application up and running (we will probably use a 64bit
> platform and either a FC SAN or internal raid with a battery backed
> cache for production use, if all goes well).
>
> The first thing I did when I got the machine was to do a raw dd
> write test:
>
> # time bash -c "(dd if=/dev/zero of=/opt/bigfile count=1310720
> bs=8k && sync)"
> 1310720+0 records in
> 1310720+0 records out
>
> real 2m21.438s
> user 0m0.998s
> sys 0m51.347s
>
> (10*1024)Mb/~141s => ~75.5Mb/s
>
> As a simple benchmark, I created a simple table without PK/indexes
> with 1k wide rows:
>
> create table iotest.one_kb_rows
> (
> the_col char(1024) not null
> );
>
> To fill the table, I use this simple function:
>
> create or replace function iotest.writestress(megs integer) returns
> void as $$
> declare
> char_str char(1024) := repeat('x', 1024);
> begin
> for i in 1..megs loop
> for j in 1..1024 loop
> insert into one_kb_rows(the_col) values (char_str);
> end loop;
> end loop;
> end;
> $$
> language plpgsql;
>
> Then, I tested how long it takes to write 10Gb of data to this table:
>
> iotest=> \timing
> Timing is on.
>
> iotest=> select writestress((10*1024));
> writestress
> -------------
>
> (1 row)
>
> Time: 379971.252 ms
>
> This gives that 10Gb takes ~380s => ~27Mb/s (with fsync=off),
> compared to the raw dd result (~75.5Mb/s).
>
> I assume this difference is due to:
> - simultaneous WAL write activity (assumed: for each byte written
> to the table, at least one byte is also written to WAL, in effect:
> 10Gb data inserted in the table equals 20Gb written to disk)
> - lousy test method (it is done using a function => the transaction
> size is 10Gb, and 10Gb will *not* fit in wal_buffers :) )
> - poor config
> - something else?
>
> I have tried to read up as much as possible on Postgres
> configuration (disk layout, buffer management, WAL sync methods,
> etc) and found this post regarding bgwriter tweaking: http://
> archives.postgresql.org/pgsql-performance/2006-03/msg00218.php -
> which explains the bgwriter config below.
>
> All params in postgresql.conf that are not commented out:
> ---------------------------------------------------------
> max_connections = 100
> superuser_reserved_connections = 2
> shared_buffers = 16000
> bgwriter_lru_percent = 20
> bgwriter_lru_maxpages = 160
> bgwriter_all_percent = 10
> bgwriter_all_maxpages = 320
> fsync = off
> wal_sync_method = open_sync
> wal_buffers = 128
> checkpoint_segments = 3
> log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
> log_rotation_age = 1440
> log_line_prefix = '%m: (%u(at)%d) '
> lc_messages = 'C'
> lc_monetary = 'C'
> lc_numeric = 'C'
> lc_time = 'C'
>
> fsync can safely be kept off during data migration as we are able
> to restart the procedure without losing data if something goes
> wrong. Increasing chekpoint_segments to 8/16/32 only increased the
> insert time, so I kept it at the default. I will increase
> shared_buffers and effective_cache_size as soon as it's time to
> tweak read performance, but for now I'm just focusing on write
> performance.
>
>
> Postgres version used:
>
> iotest=> select version();
> version
> ----------------------------------------------------------------------
> -----------------------------
> PostgreSQL 8.1.3 on i686-pc-linux-gnu, compiled by GCC gcc (GCC)
> 3.4.4 20050721 (Red Hat 3.4.4-2)
> (1 row)
>
>
> I want to make sure I have made correct assumptions before I carry
> on, so comments are welcome.
>
> - Mikael
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Vivek Khera 2006-03-20 19:04:23 Re: 1 TB of memory
Previous Message PFC 2006-03-20 16:35:00 Re: Migration study, step 1: bulk write performance optimization