Re: pg_dump / copy bugs with "big lines" ?

From: "Daniel Verite" <daniel(at)manitou-mail(dot)org>
To: "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>
Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>,"Jim Nasby" <Jim(dot)Nasby(at)bluetreble(dot)com>,"Ronan Dunklau" <ronan(dot)dunklau(at)dalibo(dot)com>,"pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump / copy bugs with "big lines" ?
Date: 2016-03-01 16:43:00
Message-ID: 4023faba-72cc-4949-b11c-8c67a0ecf407@mm
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Alvaro Herrera wrote:

> If others can try this patch to ensure it enables pg_dump to work on
> their databases, it would be great.

It doesn't seem to help if one field exceeds 1Gb, for instance when
inflated by a bin->hex translation.

postgres=# create table big as
select pg_read_binary_file('data') as binarycol;

postgres=# select octet_length(binarycol) from big;
octet_length
--------------
1073700000

postgres=# copy big to '/var/tmp/big.copy';
ERROR: XX000: invalid memory alloc request size 2147400003
LOCATION: palloc, mcxt.c:903

Same problem with pg_dump.

OTOH, it improves the case where the cumulative size of field contents
for a row exceeds 1 Gb, but not any single field exceeds that size.

If splitting the table into 3 fields, each smaller than 512MB:

postgres=# create table big2 as select
substring(binarycol from 1 for 300*1024*1024) as b1,
substring(binarycol from 1+300*1024*1024 for 300*1024*1024) as b2 ,
substring(binarycol from 1+600*1024*1024 for 400*1024*1024) as b3
from big;

postgres=# copy big2 to '/var/tmp/big.copy';
COPY 1

then that works, producing a single line of 2097152012 chars
in the output file.

By contrast, it fails with an unpatched 9.5:

postgres=# copy big2 to '/var/tmp/big.copy';
ERROR: 54000: out of memory
DETAIL: Cannot enlarge string buffer containing 629145605 bytes by 629145602
more bytes.
LOCATION: enlargeStringInfo, stringinfo.c:260

If setting bytea_output to 'escape', it also fails with the patch applied,
as it tries to allocate 4x the binary field size, and it exceeds 1GB again.

postgres=# set bytea_output =escape;
SET
postgres=# copy big2 to '/var/tmp/big.copy';
ERROR: invalid memory alloc request size 1258291201
LOCATION: palloc, mcxt.c:821

1258291201 = 300*1024*1024*4+1

Also, the COPY of both tables work fine if using (FORMAT BINARY),
on both the patched and unpatched server.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-03-01 16:50:36 Re: pg_dump / copy bugs with "big lines" ?
Previous Message Pavel Stehule 2016-03-01 16:30:59 Re: PROPOSAL: Fast temporary tables