Re: pg_dump slow with bytea data

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: "chris r(dot)" <chricki(at)gmx(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump slow with bytea data
Date: 2011-03-02 14:41:59
Message-ID: AANLkTimS4=7OfwsYJS4RHTf0KzYNny5+APFGqD=K-ztz@mail.gmail.com
Lists: pgsql-general

On Wed, Mar 2, 2011 at 2:35 AM, chris r. <chricki(at)gmx(dot)net> wrote:
> Dear list,
>
> As discussed extensively in the past [1], pg_dump tends to be slow for
> tables that contain bytea columns with large contents. Starting with
> postgres version 8.5 the COPY format of bytea was changed from escape to
> hex [1], giving ~50% performance boost.
>
> However, we have recently run into serious problems with our weekly
> database backup. We suspect the reason is that we changed some columns
> from text holding base64-encoded binary data to bytea columns. This
> change affected a large fraction of the database (~400 GB). Note that
> we ran VACUUM FULL on the affected tables.
>
> After this change our backup procedure slowed down heavily. Whereas it
> took about 8 hours before the change, pg_dump is still busy with the
> first table (holding roughly 50 GB) after 12 hours of backup. If I
> extrapolate from this, the backup would take about 10 times as long as
> it did before the change. The command we run is simply:
>   pg_dump -f <outputfile> -F c <db>
>
> The main reason for this immense slow-down was identified in [1] as the
> conversion of bytea into a compatible format (i.e. hex). However, given
> the size of the db, a factor of 10 makes backups practically infeasible.

Hm. Where exactly is all this time being spent? Are you I/O bound?
CPU bound? Is there any compression going on? Maybe this is a
performance issue inside pg_dump itself, not necessarily a text/binary
issue (I have a hard time believing that going from b64->hex is 10x
slower on a format basis alone). Can you post times comparing manual
COPY via text, manual COPY via binary, and pg_dump -F c?
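
As a rough check on the format argument, the sketch below compares how many
characters the same payload needs as base64 text and as hex bytea output. It
only measures output size, not conversion cost inside the server, and the
function name encoded_sizes is made up for illustration. It also ignores any
extra escaping COPY applies on top.

```python
import base64
import binascii
import os


def encoded_sizes(payload: bytes) -> dict:
    """Return the number of characters each text encoding of payload needs."""
    return {
        "raw": len(payload),
        "base64": len(base64.b64encode(payload)),
        # The hex bytea output format prefixes the hex digits with "\x".
        "hex": len("\\x") + len(binascii.hexlify(payload)),
    }


# Random bytes stand in for real binary column contents.
sizes = encoded_sizes(os.urandom(3_000_000))
for name, size in sizes.items():
    print(f"{name:>6}: {size:>9} chars ({size / sizes['raw']:.2f}x raw)")
```

Hex needs two characters per byte and base64 roughly 4/3, so the hex output
is about 1.5 times larger. That is more data to write, but nowhere near a
factor of 10.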

merlin
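
One way to collect the timings requested above is a small harness like the
following. It is only a sketch: the database name, table name, and dump file
path are placeholders, and the function names are invented here. It assumes
psql and pg_dump are on the PATH and can connect without a password prompt.
Because the custom format (-F c) is compressed by default, a fourth run with
-Z 0 is included to separate compression cost from conversion cost.

```python
import subprocess
import time


def build_commands(db: str, table: str, dump_file: str) -> dict:
    """Build the argv lists for each export variant to compare."""
    return {
        # Manual COPY in text format, streamed to stdout.
        "copy_text": ["psql", "-d", db, "-c", f"COPY {table} TO STDOUT"],
        # Manual COPY in binary format (older WITH BINARY spelling).
        "copy_binary": ["psql", "-d", db, "-c",
                        f"COPY {table} TO STDOUT WITH BINARY"],
        # pg_dump custom format, restricted to the one table.
        "pg_dump_custom": ["pg_dump", "-F", "c", "-t", table,
                           "-f", dump_file, db],
        # Same, with compression switched off.
        "pg_dump_custom_nocompress": ["pg_dump", "-F", "c", "-Z", "0",
                                      "-t", table, "-f", dump_file, db],
    }


def time_command(argv: list) -> float:
    """Run argv, discard its stdout, and return elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
    return time.perf_counter() - start


def run_all(db: str, table: str, dump_file: str) -> None:
    """Time every variant in turn and print one line per result."""
    for name, argv in build_commands(db, table, dump_file).items():
        print(f"{name:<28} {time_command(argv):10.1f} s")
```

Calling run_all("mydb", "big_table", "/tmp/big_table.dump") with real names
prints one timing per variant. Watching top or iostat while each variant runs
shows whether that step is CPU bound or I/O bound.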
