Re: pg_dump / copy bugs with "big lines" ?

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Daniel Verite <daniel(at)manitou-mail(dot)org>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Ronan Dunklau <ronan(dot)dunklau(at)dalibo(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_dump / copy bugs with "big lines" ?
Date: 2016-11-28 17:00:29
Message-ID: 20161128170029.npsjil5emwf4uojc@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Daniel Verite wrote:

> And in enlargeStringInfo() the patch adds this:
> /*
> * Maximum size for strings with allow_long=true.
> * It must not exceed INT_MAX, as the StringInfo routines
> * expect offsets into the buffer to fit into an int.
> */
> const int max_alloc_long = 0x7fffffff;
>
> On a 32-bit arch, we can expect max_alloc_long == MaxAllocHugeSize,
> but on a 64-bit arch, it will be much smaller with MaxAllocHugeSize
> at (2^63)-1.

Yeah, you're right. Here's a v4 of this patch, in which I've done some
further very small adjustments to your v3:

* I changed the 0x7fffffff hardcoded value with some arithmetic which
sholud have the same result on most architectures:

/*
* Determine the upper size limit. The fact that the StringInfo API uses
* "int" everywhere limits us to int's width even for "long_ok" strings.
*/
limit = str->long_ok ?
(((Size) 1) << (Min(sizeof(int), sizeof(Size)) * 8 - 1)) - 1 :
MaxAllocSize;

Initially I just had "sizeof(int)" instead of the Min(), and a
StaticAssert for sizeof(int) <= sizeof(Size), on the grounds that no
actual platform is likely to break that (though I think the C standard
does allow it). But I realized that doing it this way is simple enough;
and furthermore, in any platforms where int is 8 bytes (ILP64), this
would automatically allow for 63-bit-sized stringinfos. I don't think
this exists today, but wikipedia[1] lists two obsolete ones, [2] and [3].

[1] https://en.wikipedia.org/wiki/64-bit_computing#64-bit_data_models
[2] https://en.wikipedia.org/wiki/HAL_SPARC64
[3] https://en.wikipedia.org/wiki/UNICOS

* I reverted the enlargement looping logic in enlargeStringInfo() to
pretty much the original code (minus the cast).

* I tweaked a few comments.

The big advantage of your v3 patch is that it can be backpatched without
fear of breaking ABI, so I've struggled to maintain that property in my
changes. We can further tweak in HEAD-only; for example change the API
to use Size instead of int. I think that would be desirable, but let's
not do it until we have backpatched this one.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
huge-stringinfo-v4.patch text/plain 7.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-11-28 17:04:44 Re: Wrong order of tests in findDependentObjects()
Previous Message Pavel Stehule 2016-11-28 16:56:41 Re: Tackling JsonPath support