Skip site navigation (1) Skip section navigation (2)

Re: Slow dump with pg_dump/pg_restore ? How to improve ?

From: Soeren Gerlach <soeren(at)all-about-shift(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Slow dump with pg_dump/pg_restore ? How to improve ?
Date: 2004-06-30 22:08:46
Message-ID: 200407010008.46744.soeren@all-about-shift.com (view raw or flat)
Thread:
Lists: pgsql-general
> > So...the dump in the above format needs some 14 minutes, the restore 10
> > minutes. This seems to be very slow as it means something like 100K/sec
> > for dumping and restoring. The drive is cappable of 40 Meg/seconds, so
> > thats not the bottleneck ,-) Anyhow postmaster and pg_dump seem to max
> > out the CPU cycles as it's running at nearly 100% while dumping and
> > restoring.
>
> What datatypes have you got in the large tables?  Also, what character
> set encoding are you using?

datestyle = 'ISO,European' ?

> The only reason I can think of for dump to be that slow is if conversion
> of the data to text is a big time sink.  This would involve the
> datatype's own output routine plus possibly a character set conversion.
> You should at least make sure that no character set conversion needs to
> happen (offhand I think this would only be an issue if pg_dump is
> invoked with PGCLIENTENCODING set in its environment).

This are the schemas of the two big tables mentioned:

-------------------------------------------------------------------------
CREATE TABLE public.agent_action_history
(
  aah_tag_id bigserial NOT NULL,
  aah_action_type int2 NOT NULL DEFAULT 0,
  aah_quantity float4 NOT NULL DEFAULT 0,
  aah_price float4 NOT NULL DEFAULT 0,
  aah_sim_flag bool NOT NULL DEFAULT false,
  aah_timestamp timestamp NOT NULL,
  aah_action_pl float4 NOT NULL DEFAULT 0
) WITHOUT OIDS;

CREATE TABLE public.tick_history
(
  tkh_id serial NOT NULL,
  tkh_cdt_id int8 NOT NULL,
  tkh_price float4 NOT NULL,
  tkh_price_type int2 NOT NULL,
  tkh_volume float4 NOT NULL,
  tkh_system_time timestamp(6) NOT NULL,
  tkh_exchange_time timestamp(6) NOT NULL,
  CONSTRAINT "PK_tick_history" PRIMARY KEY (tkh_id)
) WITHOUT OIDS;
-------------------------------------------------------------------------

> Also I trust you are using dump with the default COPY-style output,
> not dump-as-INSERTs?

Yes. I'm dumping using "-c -Fc -v" as options. I just rerun the whole:

 * Total rows: 904,000
 * Dumping plain format: 21min == 1260 sec == 717 rows/sec
 * Dumping "-c -Fc" format: 26min == 1560 sec == 580 rows/sec
 * pg_dump takes 2/3 of the [single] CPU, postmaster the other 1/3 for both
   dumps
 * the resulting "-Fc" dump is 1/10 of the size of the plain dump

It would be interesting to know from people with DBs > 1 Gig how fast they 
can dump in terms of rows/second.

Another info: I just noticed that I've 7.4.1 running not 7.4.3. So I'll 
retest this tomorrow again and provide you with numbers for 7.4.3 too.


I'm likely to install Sybase as a reference to Postgres next week. But I can 
remember to got there something like 5-10,000 rows/sec some one year ago 
when I tested a 11.5 version for this issue too.



Thanks,
Soeren Gerlach


In response to

Responses

pgsql-general by date

Next:From: Alvaro HerreraDate: 2004-06-30 22:10:26
Subject: Re: Internationalization
Previous:From: Chris BrowneDate: 2004-06-30 21:43:02
Subject: Re: what's best: opteron or xeon for pg 7.4.3?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group