Does pg_dump have a 2GB file-size limit?

From: David Schnur <dnschnur(at)gmail(dot)com>
To: pgsql-admin(at)postgresql(dot)org
Subject: Does pg_dump have a 2GB file-size limit?
Date: 2009-09-14 17:39:54
Message-ID: 50000b2e0909141039x1b0c4a9cyac03934fa4c50316@mail.gmail.com
Lists: pgsql-admin

I develop a piece of software that uses PostgreSQL (8.3.5) as a back-end
database. The software can, optionally, use pg_dump to create snapshots of
the database. One user has run into a problem where pg_dump dumps 2GB, then
claims that the archive is too large. I haven't yet found documentation for
pg_dump's internal limits, and was wondering if anyone could shed light on
this. I'm not an expert on Linux filesystems, so perhaps I'm just missing
something obvious, but I was under the impression that default ext3, with a
4K block size, supported files up to 2TB in size.

Here is the command line used to invoke pg_dump:

pg_dump -v -F c -x -O -f "path/to/file" -h db_host -p db_port -U user_name
database_name

A temporary file is created in a secure location and used with PGPASSFILE to
avoid the password prompt; a rough sketch of that setup follows the log
excerpt below. Here is the error found in the pg_dump log:

pg_dump: dumping contents of table XYZ (edited name)
pg_dump: [custom archiver] could not write to output file: File too large
pg_dump: *** aborted because of error
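
For reference, here is a rough sketch of how the snapshot step sets up the
temporary password file and runs the command above; the path, the connection
settings, and the (missing) error handling are placeholders, not our real
code:

/* Sketch only: write a temporary password file, point PGPASSFILE at it,
 * and run pg_dump.  libpq expects host:port:database:user:password and
 * ignores the file if it is group- or world-readable. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    const char *pgpass = "/secure/tmp/snapshot-pgpass";   /* placeholder */
    int status;

    FILE *f = fopen(pgpass, "w");
    if (f == NULL)
        return 1;
    fprintf(f, "db_host:db_port:database_name:user_name:secret\n");
    fclose(f);
    chmod(pgpass, S_IRUSR | S_IWUSR);                      /* 0600 */

    setenv("PGPASSFILE", pgpass, 1);
    status = system("pg_dump -v -F c -x -O -f \"path/to/file\" "
                    "-h db_host -p db_port -U user_name database_name");

    unlink(pgpass);
    return status == 0 ? 0 : 1;
}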

The leftover, partially created file is a suspicious 2147483647 bytes (2^31 - 1) in
size. The DB is running on 32-bit RHEL 5.3 (Linux-2.6.18-128.1.16). I
don't know the libc version at the moment, but it would be whatever was
shipped with the distro. Here is the output of ulimit on that system:

core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 77823
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 77823
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
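
In case the shell's ulimit output doesn't match what the pg_dump process
actually inherits, here is a trivial sketch (nothing non-standard assumed)
that reports the soft file-size limit from inside a process:

/* Sketch: print the per-process file-size limit that ulimit -f reports. */
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_FSIZE, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("soft file-size limit: unlimited\n");
    else
        printf("soft file-size limit: %llu bytes\n",
               (unsigned long long) rl.rlim_cur);
    return 0;
}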

The pg_dump binary was compiled on a different system, 32-bit RHEL4
(Linux-2.6.9-78.0.1) against glibc 2.3.4.
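
My working suspicion, for what it's worth, is large-file support in that
build environment rather than a filesystem limit: "File too large" is
strerror(EFBIG), the errno a write gets on 32-bit Linux when the program was
built with a 32-bit off_t and the file would grow past 2^31 - 1 bytes. A
quick sketch of the check I'd compile on the RHEL4 box (standard glibc only):

/* Sketch: does a plain compile here give a 64-bit off_t?  Without
 * -D_FILE_OFFSET_BITS=64 on 32-bit Linux, off_t is 4 bytes and writes
 * fail with EFBIG once a file reaches 2^31 - 1 bytes, which matches the
 * 2147483647-byte dump exactly. */
#include <stdio.h>
#include <sys/types.h>

int main(void)
{
    printf("sizeof(off_t) = %u bytes (8 means large files are OK)\n",
           (unsigned) sizeof(off_t));
#ifdef _FILE_OFFSET_BITS
    printf("_FILE_OFFSET_BITS = %d\n", _FILE_OFFSET_BITS);
#else
    printf("_FILE_OFFSET_BITS is not defined for this compile\n");
#endif
    return 0;
}

Building that with and without -D_FILE_OFFSET_BITS=64 should show whether a
default compile there ends up with a 32-bit off_t.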

I don't have exact numbers at the moment, but the database is approximately
150 GB in size. Roughly 2/3rds of that is occupied by one table with
approximately 2 billion rows, which was dumped successfully according to the
pg_dump log. Most of the remainder is occupied by table XYZ, with
approximately 200 million rows.

Since the software is cross-platform, I'd prefer to avoid workarounds such
as "pipe it through split" unless tracking down the problem itself fails.
Also, since the machine running the database belongs to a user, I'm limited
in what I can "try out" on it. If changes are needed for a solution, that's
fine, but changes purely for diagnosis, e.g. "try X and see if that works",
are very difficult. I'm aware that the kernel/libc we build against is
rather old, but it shouldn't be so old as to be a problem. Although I have
more control over that machine, updating it unnecessarily is undesirable,
and I can only do so if it would contribute to a solution. Finally,
migrating to PG 8.4 is not an option in the short term, unless it would
definitely fix the problem.

I know these restrictions may make it much harder to diagnose the issue, so
I apologize in advance. Thank you for any help you can provide,

David
