Re: PostgreSQL 8.4 performance tuning questions

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Scott Carey <scott(at)richrelevance(dot)com>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Matthew Wakeling <matthew(at)flymine(dot)org>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: PostgreSQL 8.4 performance tuning questions
Date: 2009-07-31 17:04:52
Message-ID: 22574.1249059892@sss.pgh.pa.us
Lists: pgsql-performance

Greg Stark <gsstark(at)mit(dot)edu> writes:
> On Thu, Jul 30, 2009 at 11:30 PM, Tom Lane<tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I did some tracing and verified that pg_dump passes data to deflate()
>> one table row at a time. I'm not sure about the performance
>> implications of that, but it does seem like it might be something to
>> look into.

> I suspect if this was a problem the zlib people would have added
> internal buffering ages ago. I find it hard to believe we're not the
> first application to use it this way.

I dug into this a bit more. zlib *does* have internal buffering --- it
has to, because it needs a minimum lookahead of several hundred bytes
to ensure that compression works properly. The per-call overhead of
deflate() looks a bit higher than one could wish when submitting short
chunks, but oprofile shows that "pg_dump -Fc" breaks down about like
this:

samples   %        image name      symbol name
1103922   74.7760  libz.so.1.2.3   longest_match
 215433   14.5927  libz.so.1.2.3   deflate_slow
  55368    3.7504  libz.so.1.2.3   compress_block
  41715    2.8256  libz.so.1.2.3   fill_window
  17535    1.1878  libc-2.9.so     memcpy
  13663    0.9255  libz.so.1.2.3   adler32
   4613    0.3125  libc-2.9.so     _int_malloc
   2942    0.1993  libc-2.9.so     free
   2552    0.1729  libc-2.9.so     malloc
   2155    0.1460  libz.so.1.2.3   pqdownheap
   2128    0.1441  libc-2.9.so     _int_free
   1702    0.1153  libz.so.1.2.3   deflate
   1648    0.1116  libc-2.9.so     mempcpy
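
As a reading aid, the listing above can be tallied per image, and a rough ceiling computed for what removing any single symbol's cost could buy. This is only a sketch: the percentages are copied from the listing, the helper names (share_by_image, max_speedup_if_removed) are made up for illustration, and the ceiling assumes that sampled CPU time maps directly onto wall-clock time, which need not hold.

```python
# Percentages copied from the oprofile listing: (image, symbol, percent).
PROFILE = [
    ("libz.so.1.2.3", "longest_match", 74.7760),
    ("libz.so.1.2.3", "deflate_slow", 14.5927),
    ("libz.so.1.2.3", "compress_block", 3.7504),
    ("libz.so.1.2.3", "fill_window", 2.8256),
    ("libc-2.9.so", "memcpy", 1.1878),
    ("libz.so.1.2.3", "adler32", 0.9255),
    ("libc-2.9.so", "_int_malloc", 0.3125),
    ("libc-2.9.so", "free", 0.1993),
    ("libc-2.9.so", "malloc", 0.1729),
    ("libz.so.1.2.3", "pqdownheap", 0.1460),
    ("libc-2.9.so", "_int_free", 0.1441),
    ("libz.so.1.2.3", "deflate", 0.1153),
    ("libc-2.9.so", "mempcpy", 0.1116),
]


def share_by_image(profile):
    """Sum the sample percentages for each image (shared library)."""
    totals = {}
    for image, _symbol, pct in profile:
        totals[image] = totals.get(image, 0.0) + pct
    return totals


def max_speedup_if_removed(profile, symbol):
    """Amdahl-style ceiling: the speedup if `symbol` cost nothing at all."""
    pct = sum(p for _image, sym, p in profile if sym == symbol)
    return 1.0 / (1.0 - pct / 100.0)


if __name__ == "__main__":
    for image, pct in sorted(share_by_image(PROFILE).items()):
        print(f"{image:15s} {pct:8.4f}%")
    ceiling = max_speedup_if_removed(PROFILE, "deflate_slow")
    print(f"ceiling if deflate_slow were free: {ceiling:.3f}x")
```

About 97% of the listed samples fall inside libz, and even making deflate_slow entirely free would cap the gain at roughly 1.17x. Only part of that cost could plausibly be attributed to short submissions, so the realistic gain is smaller than the ceiling.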

longest_match is the core lookahead routine and is not going to be
affected by submission sizes, because it isn't called unless adequate
data (i.e., the longest possible match length) is available in zlib's
internal buffer. It's possible that doing more buffering on our end
would reduce the deflate_slow component somewhat, but it looks like
the most we could hope to get that way is in the range of 10% speedup.
So I'm wondering if anyone can provide concrete evidence of large
wins from buffering zlib's input.
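
One way to gather such evidence is a small harness that compresses the same rows twice, once submitting each row separately and once after coalescing rows into larger chunks. The sketch below uses Python's zlib module as a stand-in and is not pg_dump's code path: the row generator, the 64 kB chunk size, and the function names are all assumptions for illustration, and interpreter call overhead will exaggerate the per-row cost compared with calling deflate() from C.

```python
import time
import zlib


def make_rows(n=20000):
    # Hypothetical stand-in for dumped table data: short tab-separated rows.
    return [f"{i}\tname_{i % 97}\t{i * 31 % 1000}\n".encode() for i in range(n)]


def compress_chunks(chunks, level=6):
    """Feed each chunk to one streaming compressor, then finish the stream."""
    comp = zlib.compressobj(level)
    out = [comp.compress(chunk) for chunk in chunks]
    out.append(comp.flush())
    return b"".join(out)


def rebuffer(rows, size=64 * 1024):
    """Coalesce small rows into chunks of at least `size` bytes."""
    buf = bytearray()
    for row in rows:
        buf += row
        if len(buf) >= size:
            yield bytes(buf)
            buf.clear()
    if buf:
        yield bytes(buf)


def timed(fn):
    """Return fn()'s result together with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


if __name__ == "__main__":
    rows = make_rows()
    per_row, t_row = timed(lambda: compress_chunks(rows))
    buffered, t_buf = timed(lambda: compress_chunks(rebuffer(rows)))
    original = b"".join(rows)
    # Both streams must decompress to the same input.
    assert zlib.decompress(per_row) == original
    assert zlib.decompress(buffered) == original
    print(f"row at a time: {t_row:.4f}s, {len(per_row)} bytes")
    print(f"buffered:      {t_buf:.4f}s, {len(buffered)} bytes")
```

Timings from a harness like this are only suggestive; a convincing result would come from the same comparison made in C against real table data.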

regards, tom lane
