Re: varlena beyond 1GB and matrix

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc: PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: varlena beyond 1GB and matrix
Date: 2016-12-07 23:04:09
Message-ID: CA+TgmobGK6vi7QrroqDrJ0zPufTmzbcNzmrrwCY6=xYckBM+TQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 7, 2016 at 8:50 AM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
> I'd like to propose a new optional type handler, 'typserialize', to
> serialize an in-memory varlena structure (that can have indirect
> references) to on-disk format.
> If present, it shall be invoked at the head of toast_insert_or_update()
> so that indirect references are transformed into something else that
> is safe to save. (My expectation is that the 'typserialize' handler
> first saves the indirect chunks to the toast relation, then
> puts toast pointers in their place.)

This might not work. The reason is that we have important bits of
code that expect that they can figure out how to do some operation on
a datum (find its length, copy it, serialize it) based only on typlen
and typbyval. See src/backend/utils/adt/datum.c for assorted
examples. Note also the lengthy comment near the top of the file,
which explains that typlen > 0 indicates a fixed-length type, typlen
== -1 indicates a varlena, and typlen == -2 indicates a cstring. I
think there's probably room for other typlen values; for example, we
could have typlen == -3 indicate some new kind of thing -- a
super-varlena that has a higher length limit, or some other kind of
thing altogether.
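
To make the typlen/typbyval dispatch concrete, here is a minimal
stand-alone sketch of the kind of decision that code has to make. It is
not the actual datum.c code: toy_datum_size() is a hypothetical name, the
varlena header is simplified to a plain 4-byte length, and returning 0
for an unrecognized typlen is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical, simplified analogue of a "how big is this datum?" helper.
 * It decides the size from typlen and typbyval alone, with no per-type
 * callback. Assumption: a varlena here starts with a plain 4-byte total
 * length (the real header also carries flag bits).
 */
static size_t
toy_datum_size(const void *value, int typlen, int typbyval)
{
    uint32_t    total;

    if (typbyval)
    {
        /* Pass-by-value types are always fixed-length. */
        assert(typlen > 0 && (size_t) typlen <= sizeof(uintptr_t));
        return (size_t) typlen;
    }

    assert(value != NULL);

    if (typlen > 0)
        return (size_t) typlen;     /* fixed-length, passed by reference */

    if (typlen == -1)
    {
        /* varlena: the length is read from the value's own header */
        memcpy(&total, value, sizeof(total));
        return total;
    }

    if (typlen == -2)
        return strlen((const char *) value) + 1;    /* cstring, incl. NUL */

    /* Any other code, such as a hypothetical -3, has no rule here. */
    return 0;
}
```

Nothing in this function can consult a per-type handler, and a new typlen
code falls through to the last line until every such switch is taught
about it.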

Now, you can imagine trying to treat what you're talking about as a
new type of TOAST pointer, but I think that's not going to help,
because at some point the TOAST pointer gets de-toasted into a varlena
... which is still limited to 1GB. So that's not really going to
work. And it brings out another point, which is that if you define a
new typlen code, like -3, for super-big things, they won't be
varlenas, which means they won't work with the existing TOAST
interfaces. Still, you might be able to fix that. You would probably
have to do some significant surgery on the wire protocol, per the
commit message for fa2fa995528023b2e6ba1108f2f47558c6b66dcd.
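
As background for the 1GB figure: a 4-byte varlena header keeps 30 bits
for the length and uses the remaining 2 bits as flags. The sketch below
shows only that arithmetic. The TOY_* names and the bit positions are
assumptions for illustration; the real header macros differ by byte order.

```c
#include <stdint.h>

/* Assumed toy layout: top 2 bits are flags, low 30 bits are the length. */
#define TOY_LEN_BITS 30
#define TOY_LEN_MASK ((UINT32_C(1) << TOY_LEN_BITS) - 1)    /* 0x3FFFFFFF */

/* Pack flag bits and a length into one 32-bit header word. */
static uint32_t
toy_make_header(uint32_t flags, uint32_t len)
{
    return ((flags & UINT32_C(3)) << TOY_LEN_BITS) | (len & TOY_LEN_MASK);
}

/* Recover the length field from a header word. */
static uint32_t
toy_header_len(uint32_t hdr)
{
    return hdr & TOY_LEN_MASK;
}

/* A length is representable only if it fits in 30 bits (1GB minus 1 byte). */
static int
toy_len_fits(uint64_t len)
{
    return len <= TOY_LEN_MASK;
}
```

A length of exactly 2^30 bytes no longer fits in the field, so no choice
of TOAST pointer format changes what the de-toasted value can hold.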

I think it's probably a mistake to conflate objects with substructure
with objects > 1GB. Those are two somewhat orthogonal needs. As Jim
also points out, expanded objects serve the first need. Of course,
those assume that we're dealing with a varlena, so if we made a
super-varlena, we'd still need to create an equivalent. But perhaps
it would be a fairly simple adaptation of what we've already got.
Handling objects >1GB at all seems like the harder part of the
problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
