Re: 4B row limit for CLOB tables

From: Roger Pack <rogerdpack2(at)gmail(dot)com>
To: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: 4B row limit for CLOB tables
Date: 2015-01-29 21:06:13
Message-ID: CAL1QdWfb-p5kE9DT2pMqBxohaKG=vxmDREmSBjc+7TkbOeKBBw@mail.gmail.com
Lists: pgsql-general pgsql-hackers

Forgot to reply-all on this one; many thanks to Steve, Adrian, and Bill
for their answers.

On Jan 29, 2015, at 12:32 PM, Roger Pack <rogerdpack2(at)gmail(dot)com> wrote:

> On 1/29/15, Steve Atkins <steve(at)blighty(dot)com> wrote:
>>
>> On Jan 29, 2015, at 9:53 AM, Roger Pack <rogerdpack2(at)gmail(dot)com> wrote:
>>
>>> On 1/29/15, Roger Pack <rogerdpack2(at)gmail(dot)com> wrote:
>>>> Hello. I see on this page a mention of basically a 4B row limit for
>>>> tables that have BLOBs
>>>
>>> Oops I meant for BYTEA or TEXT columns, but it's possible the
>>> reasoning is the same...
>>
>> It only applies to large objects, not bytea or text.
>>
>>>> https://wiki.postgresql.org/wiki/BinaryFilesInDB
>>
>> Some of that looks incorrect or out of date (e.g. large objects can be
>> a lot bigger than 2GB in 9.3+).
>>
>>
>>>>
>>>> Is this fact mentioned in the documentation anywhere? Is there an
>>>> official source for this? (If not, maybe consider this a feature
>>>> request to mention it in the documentation on BLOB).
>>>> Cheers and thanks.
>>>> -roger
>>
>> I'm not sure whether it's mentioned explicitly, but large objects are
>> referenced by an OID, which is a 32 bit value (and a global resource).
>
> Thanks for the info, precisely what I was looking for.
> As a follow-up, could you elaborate on what you mean by "global
> resource"? I believe OIDs are generated from a global counter. Does
> this mean the maximum number of large objects in the database is 4B?

Well, OIDs are generated from a global counter, but there's nothing to
stop that wrapping around.

OIDs are used in a lot of places in the system - to identify tables,
functions, loaded modules and suchlike - and duplicates are prevented
by unique indexes or similar. That means that if the OID counter were
to wrap around and return an OID already in use in the same context,
the attempt to use it would fail. For instance, you might try to
create a table, and it would fail because the "next" OID was already
in use by another table.
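
A minimal sketch of what that allocation looks like, assuming a
scratch database you can write to (lo_create() and the
pg_largeobject_metadata catalog are standard in 9.0 and later; the
returned OID here is just an example value):

    -- Create an empty large object; passing 0 asks the server to
    -- assign the next free OID from the global counter.
    SELECT lo_create(0);   -- e.g. returns 16402

    -- Each large object pins one entry in the shared OID space:
    SELECT oid FROM pg_largeobject_metadata ORDER BY oid DESC LIMIT 5;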

Wrapping the OID counter around will cause all sorts of things to break.

Use of OIDs by the user (as opposed to by postgresql itself for internal
bookkeeping) has been deprecated for years.

That's one reason, but not the only reason, that I don't believe anyone
should ever use the postgresql large object infrastructure. For small
(megabytes rather than gigabytes) chunks of data that might or might not
be processed in the database, bytea or text types are the right thing.
For anything larger, or anything that's not actually processed within
the database (e.g. images or PDFs handled by a webapp), leaving the
file on the filesystem and just storing metadata in the database is
usually the right thing.
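
To make the two patterns concrete, a rough sketch (the table and
column names here are just illustrative):

    -- Small payloads the database may process: store them inline.
    CREATE TABLE attachment (
        id      bigserial PRIMARY KEY,
        name    text NOT NULL,
        content bytea NOT NULL   -- fine at megabyte scale
    );

    -- Big or externally-served files: keep the bytes on the
    -- filesystem and store only metadata in the database.
    CREATE TABLE stored_file (
        id         bigserial PRIMARY KEY,
        path       text NOT NULL,   -- location on disk
        mime_type  text,
        size_bytes bigint
    );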

> If you actually had that many BLOBs (and the counter wrapped) I
> assume that lo_create would start failing [i.e. it has some kind of
> uniqueness constraint on the oid]? Or something like that?

lo_create() would fail, but so would the rest of the database. Nothing
would work.
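
If you want a rough sense of how much of the OID space your large
objects are consuming, something like this should work (the
pg_largeobject_metadata catalog exists in 9.0 and later):

    -- One row per live large object; each one consumes an OID
    -- out of the ~4 billion available.
    SELECT count(*) FROM pg_largeobject_metadata;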

A billion large objects is too many. (Exactly a billion too many, in fact).

Cheers,
Steve
