I found one more issue that arises when we apply largeobject-style interfaces
to generic toasted varlena data.
When we fetch a toasted datum, the pg_toast_%u relation is scanned with
SnapshotToast, because it is assumed that toasted chunks never have multiple
versions, and that visibility of the toast pointer always implies visibility
of the toast chunks.
However, if we provide largeobject-style interfaces that allow partial
updates on a toasted varlena, it seems to me this assumption will no longer
hold.
Does anyone have a good idea?
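
To make the concern concrete, here is a minimal toy model in C. It is not PostgreSQL code: the ChunkTuple struct, the integer transaction ids, and the two visibility functions are simplifications invented for illustration. visible_toast() only models the assumption described above (every chunk version is accepted), while visible_mvcc() models an ordinary snapshot check.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Xid;   /* toy transaction id, not PostgreSQL's type */
#define INVALID_XID 0

/* One stored version of one toast chunk (hypothetical, simplified). */
typedef struct ChunkTuple
{
    int chunk_seq;          /* position of the chunk within the value */
    Xid xmin;               /* transaction that created this version */
    Xid xmax;               /* transaction that replaced it, or INVALID_XID */
} ChunkTuple;

/* Toy MVCC rule: transactions with id <= snapshot count as committed
 * and visible, later ones as invisible. */
static bool
visible_mvcc(const ChunkTuple *t, Xid snapshot)
{
    if (t->xmin > snapshot)
        return false;
    if (t->xmax != INVALID_XID && t->xmax <= snapshot)
        return false;
    return true;
}

/* Toy stand-in for the toast-style rule: every chunk version is accepted,
 * because visibility is assumed to be decided by the toast pointer. */
static bool
visible_toast(const ChunkTuple *t)
{
    (void) t;
    return true;
}

/* Count the versions of one chunk that a reader would get back. */
static int
count_visible(const ChunkTuple *rel, size_t n, int chunk_seq,
              bool use_toast_rule, Xid snapshot)
{
    int count = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (rel[i].chunk_seq != chunk_seq)
            continue;
        if (use_toast_rule ? visible_toast(&rel[i])
                           : visible_mvcc(&rel[i], snapshot))
            count++;
    }
    return count;
}
```

If transaction 200 rewrites only chunk 1 of a value created by transaction 100, the toy relation holds two versions of chunk 1. In this model a reader applying the toast-style rule gets both versions back, while a reader applying the MVCC rule gets exactly one, whichever snapshot it holds.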
KaiGai Kohei wrote:
> I concluded that the following issues should be solved when we apply
> largeobject-like interfaces on the big toasted data within general
> relations, not only pg_largeobject system catalog.
> At first, we need to add a new strategy to store the given varlena data
> on the external toast relation.
> If we try to seek and fetch a certain data chunk, we must be able to
> compute which chunk stores the required data from the given offset and
> length. So, the external chunks should always be uncompressed. This is a
> common requirement for both read and write operations.
> If we try to update a part of the toasted data chunks, the datum should not
> be inlined regardless of its length, because in that case we would need to
> update the whole tuple that contains the inlined data.
> If we open the toasted varlena with read-only mode, inlined one does not
> prevent anything. It is an issue for only write operation.
> I would like to add a new strategy to pg_type.typstorage with the following
> characteristics:
> 1. It always stores the given varlena data without any compression.
>    So, the given data is stored as a set of fixed-length chunks.
> 2. It always stores the given varlena data on the external toast relation.
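
For comparison with the existing strategies, the two points above can be sketched as a small C table. The letters 'p', 'e', 'm' and 'x' are the existing typstorage codes; the letter 'b', the StoragePlan struct and storage_plan_for() are hypothetical and exist only in this example.

```c
#include <stdbool.h>

/* Hypothetical code for the proposed strategy; not an existing value. */
#define STORAGE_ALWAYS_EXTERNAL 'b'

typedef struct StoragePlan
{
    bool may_compress;        /* may the value be compressed? */
    bool may_store_inline;    /* may the value stay in the main tuple? */
    bool may_store_external;  /* may the value move to the toast relation? */
} StoragePlan;

/* Describe what each storage strategy permits. */
static StoragePlan
storage_plan_for(char typstorage)
{
    StoragePlan plan = { false, true, false };   /* 'p': plain */

    switch (typstorage)
    {
        case 'e':   /* external: out-of-line allowed, no compression */
            plan.may_store_external = true;
            break;
        case 'm':   /* main: compression, out-of-line only as a last resort */
        case 'x':   /* extended: compression and out-of-line both allowed */
            plan.may_compress = true;
            plan.may_store_external = true;
            break;
        case STORAGE_ALWAYS_EXTERNAL:   /* proposed: never inline, never compressed */
            plan.may_store_inline = false;
            plan.may_store_external = true;
            break;
        default:
            break;
    }
    return plan;
}
```

The proposed strategy differs from the existing 'e' only in that the inline option is removed, which is what makes partial updates independent of the main tuple.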
> I suggest a new built-in type named BLOB which has an identical definition
> to the BYTEA type, except for its attstorage.
> Next, a different version of lo_open() should be provided to accept
> BLOB type as follows:
> SELECT pictname, lo_open(pictdata, x'20000'::int) FROM my_picture;
> It will allocate a largeobject descriptor for the given BLOB data,
> and user can read and write using loread() and lowrite() interfaces.
> In this case, should it hold the relation handler and locks on the
> "my_picture" relation, not only its toast relation?
> Should the lo_open() with read-only mode be available on the existing
> TEXT or BYTEA types? I could not find any reason to deny them.
> Next, the pg_largeobject system catalog can be redefined using the BLOB
> type as follows:
>   typedef struct FormData_pg_largeobject
>   {
>       Oid     loowner;  /* OID of the largeobject owner */
>       Oid     lonsp;    /* OID of the largeobject namespace */
>       aclitem loacl;    /* access permissions */
>       blob    lodata;   /* contents of the largeobject */
>   } FormData_pg_largeobject;
> The existing largeobject interfaces perform on pg_largeobject.lodata
> specified by largeobject identifier.
> Rest of metadata can be used for access control purpose.
> KaiGai Kohei wrote:
>> Tom Lane wrote:
>>> Bernd Helmle <mailings(at)oopsware(dot)de> writes:
>>>> It might be interesting to dig into your proposal deeper in conjunction
>>>> with TOAST (you've already mentioned this TODO). Having serial access with
>>>> a nice interface into TOAST would be eliminating the need for
>>>> pg_largeobject completely (i'm not a big fan of this one-big-system-table
>>>> approach the old LO interface currently is).
>>> Yeah, it would be more useful probably to fix that than to add
>>> decoration to the LO facility. Making LO more usable is just going to
>>> encourage people to bump into its other limitations (32-bit OIDs,
>>> 32-bit object size, finite maximum size of pg_largeobject, lack of
>>> dead-object cleanup, etc etc).
>> The reason why I mentioned the named largeobject feature is
>> that DAC security checks on largeobjects require them to belong to
>> a certain schema, so I thought it quite natural for them to have a
>> string name. However, obviously, it is not a significant theme for me.
>> I also agree with your opinion that largeobject interfaces should be
>> redefined to access parts of a TOAST'ed varlena data structure,
>> not only pg_largeobject.
>> In this case, we will need a new pg_type.typstorage option which
>> forces the given varlena data to be put on the external relation without
>> compression, because we cannot estimate the data offset in inlined
>> or compressed external varlena data.
>> I'll try to submit a design within a few days.
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>