Re: varlena beyond 1GB and matrix

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: varlena beyond 1GB and matrix
Date: 2016-12-08 04:01:25
Message-ID: CADyhKSWxhpUCRPnMXw0Ef8Ly-8aCAqiPOTQM5s3+vRyOFQwe=g@mail.gmail.com
Lists: pgsql-hackers

2016-12-08 8:36 GMT+09:00 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Wed, Dec 7, 2016 at 8:50 AM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>>> I like to propose a new optional type handler 'typserialize' to
>>> serialize an in-memory varlena structure (that can have indirect
>>> references) to on-disk format.
>
>> I think it's probably a mistake to conflate objects with substructure
>> with objects > 1GB. Those are two somewhat orthogonal needs.
>
> Maybe. I think where KaiGai-san is trying to go with this is being
> able to turn an ExpandedObject (which could contain very large amounts
> of data) directly into a toast pointer or vice versa. There's nothing
> really preventing a TOAST OID from having more than 1GB of data
> attached, and if you had a side channel like this you could transfer
> the data without ever having to form a larger-than-1GB tuple.
>
> The hole in that approach, to my mind, is that there are too many places
> that assume that they can serialize an ExpandedObject into part of an
> in-memory tuple, which might well never be written to disk, or at least
> not written to disk in a table. (It might be intended to go into a sort
> or hash join, for instance.) This design can't really work for that case,
> and unfortunately I think it will be somewhere between hard and impossible
> to remove all the places where that assumption is made.
>
Regardless of the ExpandedObject, does the flattened format need to
contain the fully flattened data chunks?
If a data type internally holds multiple toast pointers, like an array of
them, its flattened image is likely small enough to be stored with the
existing varlena mechanism.
One problem is that VARSIZE() will never tell us the exact total length of
the data when it references multiple GB-scale chunks.
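Just to illustrate what I have in mind (the structure and field names below
are hypothetical, not existing code): the flat image would carry only the
external toast pointers plus a logical size, so the varlena we actually
store stays far below 1GB even when the referenced chunks add up to much
more.

#include "postgres.h"

typedef struct MatrixFlatHead
{
    int32       vl_len_;    /* varlena header; VARSIZE() covers only this
                             * header plus the pointer array below */
    int32       nchunks;    /* number of external toast values referenced */
    int64       rawsize;    /* logical total size; may exceed 1GB and is
                             * not visible through VARSIZE() */
    varatt_external chunks[FLEXIBLE_ARRAY_MEMBER];
} MatrixFlatHead;

On read, the expansion step would then fetch the referenced toast values
on demand, instead of requiring one contiguous flat copy.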

> At a higher level, I don't understand exactly where such giant
> ExpandedObjects would come from. (As you point out, there's certainly
> no easy way for a client to ship over the data for one.) So this feels
> like a very small part of a useful solution, if indeed it's part of a
> useful solution at all, which is not obvious.
>
I expect an aggregate function that consumes millions of rows to build a
large matrix, larger than 1GB. Once it is materialized as a value, it is
easy to deliver as an argument to PL functions.
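As a rough sketch of that use case (matrix_agg_trans and the aggregate
around it are hypothetical; only the accumulation side is shown, because
that part already works today - the 1GB limit only bites when the final
function has to flatten the accumulated state into one varlena):

#include "postgres.h"
#include "fmgr.h"
#include "catalog/pg_type.h"
#include "utils/array.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(matrix_agg_trans);

/*
 * Transition function of a hypothetical matrix_agg(float8[]) aggregate:
 * append every element of the input row to an ArrayBuildState kept in
 * the aggregate memory context.  With millions of rows the in-memory
 * state easily grows past 1GB; today the overflow error happens later,
 * when the final function tries to construct the flat array result.
 */
Datum
matrix_agg_trans(PG_FUNCTION_ARGS)
{
    MemoryContext   aggcontext;
    ArrayBuildState *state;
    ArrayType      *row;
    Datum          *values;
    bool           *nulls;
    int             nvals;
    int             i;

    if (!AggCheckCallContext(fcinfo, &aggcontext))
        elog(ERROR, "matrix_agg_trans called in non-aggregate context");

    state = PG_ARGISNULL(0) ? NULL : (ArrayBuildState *) PG_GETARG_POINTER(0);
    row = PG_GETARG_ARRAYTYPE_P(1);     /* one input row as float8[] */

    deconstruct_array(row, FLOAT8OID, sizeof(float8), FLOAT8PASSBYVAL, 'd',
                      &values, &nulls, &nvals);

    for (i = 0; i < nvals; i++)
        state = accumArrayResult(state, values[i], nulls[i],
                                 FLOAT8OID, aggcontext);

    PG_RETURN_POINTER(state);
}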

> FWIW, ExpandedObjects themselves are far from a fully fleshed out
> concept, one of the main problems being that they don't have very long
> lifespans except in the case that they're the value of a plpgsql
> variable. I think we would need to move things along quite a bit in
> that area before it would get to be useful to think in terms of
> ExpandedObjects containing multiple GB of data. Otherwise, the
> inevitable flattenings and re-expansions are just going to kill you.
>
> Likewise, the need for clients to be able to transfer data in chunks
> gets pressing well before you get to 1GB. So there's a lot here that
> really should be worked on before we try to surmount that barrier.
>
You are pointing out the problem around the client<->server protocol,
correct? We will likely need that enhancement eventually as well; I agree.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
