Re: pg_dump performance

From: Jared Mauch <jared(at)puck(dot)nether(dot)net>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Jared Mauch <jared(at)puck(dot)nether(dot)net>, pgsql-performance(at)postgresql(dot)org
Subject: Re: pg_dump performance
Date: 2007-12-26 21:49:59
Message-ID: 20071226214958.GA92245@puck.nether.net
Lists: pgsql-performance

On Wed, Dec 26, 2007 at 11:35:59PM +0200, Heikki Linnakangas wrote:
> I ran a quick oprofile run on my laptop, with a table like that, filled
> with dummy data. It looks like indeed ~30% of the CPU time is spent in
> sprintf, to convert the integers and inets to string format. I think you
> could speed that up by replacing the sprintf calls in int2, int4 and inet
> output functions with faster, customized functions. We don't need all the
> bells and whistles of sprintf, which gives the opportunity to optimize.

Hmm. Given the above and below, perhaps there's something that can
be tackled in the source here. I'll look at poking around in there, though
our sysadmin folks don't like the idea of running patched code (aside from
conf changes), as they're concerned about losing patches between upgrades.

I'm waiting on one of my hosts in Japan to come back online
so perhaps I can hack the source and attempt some optimization
after that point. It's not the beefy host that I have this on
though and not even multi-{core,cpu} so my luck may be poor.
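
For illustration, here is a minimal sketch of the kind of customized
integer-to-string conversion suggested above. It is not PostgreSQL's
actual output function; the name fast_itoa32 and the 12-byte buffer
contract are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not PostgreSQL code): write the decimal form of
 * value into buf and return its length, excluding the trailing NUL.
 * buf must hold at least 12 bytes: sign + 10 digits + NUL.
 */
static size_t fast_itoa32(int32_t value, char *buf)
{
    char tmp[10];               /* digits, least significant first */
    size_t n = 0;
    size_t len = 0;
    /* Work in unsigned arithmetic so INT32_MIN negates safely. */
    uint32_t u = (value < 0) ? 0u - (uint32_t) value : (uint32_t) value;

    assert(buf != NULL);

    /* Peel off digits; the do/while handles value == 0. */
    do {
        tmp[n++] = (char) ('0' + u % 10);
        u /= 10;
    } while (u != 0);

    if (value < 0)
        buf[len++] = '-';
    /* Copy the digits back in most-significant-first order. */
    while (n > 0)
        buf[len++] = tmp[--n];
    buf[len] = '\0';
    return len;
}
```

The point is only that a fixed-purpose loop skips the format-string
parsing that sprintf has to do; whether it helps in practice would need
to be profiled.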

> A binary mode dump should go a lot faster, because it doesn't need to do
> those conversions, but binary dumps are not guaranteed to work across
> versions.

I'll look at this. Since this data is being fed into something else,
perhaps I can make it slightly faster by avoiding the binary ->
string -> binary (in memory) round trip. A number of the columns are unused
in my processing and some are used only when certain criteria are met (some
are always used).
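
As a rough sketch of what consuming binary output directly could look
like: in the COPY BINARY format the stream starts with an 11-byte
signature, and each field is a 32-bit big-endian length word (-1 for
NULL) followed by the data. The helper names below are hypothetical, and
the sketch assumes an int4 column sent as 4 big-endian bytes. As noted
above, the binary format is not guaranteed to work across versions.

```c
#include <stdint.h>
#include <string.h>

/* 11-byte signature that starts a COPY BINARY stream. */
static const unsigned char PGCOPY_SIG[11] = {
    'P', 'G', 'C', 'O', 'P', 'Y', '\n', 0xFF, '\r', '\n', '\0'
};

/* Hypothetical helper: does buf begin with the COPY BINARY signature? */
static int has_pgcopy_signature(const unsigned char *buf, size_t len)
{
    return len >= sizeof PGCOPY_SIG &&
           memcmp(buf, PGCOPY_SIG, sizeof PGCOPY_SIG) == 0;
}

/* Read a 32-bit big-endian (network order) signed integer. */
static int32_t read_be32(const unsigned char *p)
{
    uint32_t u = ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16) |
                 ((uint32_t) p[2] << 8) | (uint32_t) p[3];
    int32_t v;

    memcpy(&v, &u, sizeof v);   /* reinterpret the bits without a cast */
    return v;
}

/*
 * Hypothetical helper: decode one int4 field (length word + data).
 * Returns 1 and stores the value in *out, 0 if the field is NULL,
 * or -1 if the length word is not what an int4 should have.
 */
static int decode_int4_field(const unsigned char *field, int32_t *out)
{
    int32_t len = read_be32(field);

    if (len == -1)
        return 0;               /* NULL: no data bytes follow */
    if (len != 4)
        return -1;              /* not an int4-sized field */
    *out = read_be32(field + 4);
    return 1;
}
```

Columns that are never used could simply be skipped by advancing past
the length word and that many data bytes, with no conversion at all.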

> BTW, the profiling I did earlier led me to think this should be optimized
> in the compiler. I started a thread about that on the gcc mailing list but
> got busy with other stuff and didn't follow through on that idea:
> http://gcc.gnu.org/ml/gcc/2007-10/msg00073.html

After a quick review of that thread, I'd have to say it does look
like they're right and this belongs more in the C library. My host
runs Solaris 10. There may be some optimizations that the compiler
could do when linking the C library, but I currently think they're on
sound footing.

- Jared

--
Jared Mauch | pgp key available via finger from jared(at)puck(dot)nether(dot)net
clue++; | http://puck.nether.net/~jared/ My statements are only mine.
