From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, PgHacker <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: varlena beyond 1GB and matrix |
Date: | 2016-12-07 23:36:47 |
Message-ID: | 2185.1481153807@sss.pgh.pa.us |
Lists: | pgsql-hackers |

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Wed, Dec 7, 2016 at 8:50 AM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>> I like to propose a new optional type handler 'typserialize' to
>> serialize an in-memory varlena structure (that can have indirect
>> references) to on-disk format.
> I think it's probably a mistake to conflate objects with substructure
> with objects > 1GB. Those are two somewhat orthogonal needs.

Maybe. I think where KaiGai-san is trying to go with this is being
able to turn an ExpandedObject (which could contain very large amounts
of data) directly into a toast pointer or vice versa. There's nothing
really preventing a TOAST OID from having more than 1GB of data
attached, and if you had a side channel like this you could transfer
the data without ever having to form a larger-than-1GB tuple.
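
A minimal sketch of that side-channel idea, written in Python rather than backend C and not taken from PostgreSQL's TOAST code. It assumes a large value is held as a sequence of in-memory pieces and is written out as numbered chunks under one value ID, so no buffer larger than one chunk is ever built. The names `iter_chunks`, `store_value`, `fetch_value`, the dict-based store, and the chunk size of 2000 are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Iterator, Tuple

CHUNK_SIZE = 2000  # illustrative value, not PostgreSQL's real chunk size

ChunkStore = Dict[Tuple[int, int], bytes]  # (value_id, chunk_seq) -> chunk data


def iter_chunks(parts: Iterable[bytes], chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield chunks of exactly chunk_size bytes (the last may be shorter).

    The working buffer never holds more than chunk_size bytes, so the
    value is never assembled into one contiguous allocation.
    """
    buf = bytearray()
    for part in parts:
        view = memoryview(part)
        pos = 0
        while pos < len(view):
            # Copy only as much as is needed to fill the current chunk.
            take = min(chunk_size - len(buf), len(view) - pos)
            buf += view[pos:pos + take]
            pos += take
            if len(buf) == chunk_size:
                yield bytes(buf)
                buf.clear()
    if buf:
        yield bytes(buf)


def store_value(store: ChunkStore, value_id: int, parts: Iterable[bytes],
                chunk_size: int = CHUNK_SIZE) -> int:
    """Write the pieces as numbered chunks; return the total byte count."""
    total = 0
    for seq, chunk in enumerate(iter_chunks(parts, chunk_size)):
        store[(value_id, seq)] = chunk
        total += len(chunk)
    return total


def fetch_value(store: ChunkStore, value_id: int) -> Iterator[bytes]:
    """Read the chunks back in order, one at a time."""
    seq = 0
    while (value_id, seq) in store:
        yield store[(value_id, seq)]
        seq += 1
```

The sketch only covers the path where the value goes straight to chunked storage. It does nothing for a consumer that needs the whole value inside an in-memory tuple.
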
The hole in that approach, to my mind, is that there are too many places
that assume that they can serialize an ExpandedObject into part of an
in-memory tuple, which might well never be written to disk, or at least
not written to disk in a table. (It might be intended to go into a sort
or hash join, for instance.) This design can't really work for that case,
and unfortunately I think it will be somewhere between hard and impossible
to remove all the places where that assumption is made.

At a higher level, I don't understand exactly where such giant
ExpandedObjects would come from. (As you point out, there's certainly
no easy way for a client to ship over the data for one.) So this feels
like a very small part of a useful solution, if indeed it's part of a
useful solution at all, which is not obvious.

FWIW, ExpandedObjects themselves are far from a fully fleshed out
concept, one of the main problems being that they don't have very long
lifespans except in the case that they're the value of a plpgsql
variable. I think we would need to move things along quite a bit in
that area before it would get to be useful to think in terms of
ExpandedObjects containing multiple GB of data. Otherwise, the
inevitable flattenings and re-expansions are just going to kill you.

Likewise, the need for clients to be able to transfer data in chunks
gets pressing well before you get to 1GB. So there's a lot here that
really should be worked on before we try to surmount that barrier.

regards, tom lane