Re: [PATCH] [v8.5] Security checks on largeobjects

From: KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bernd Helmle <mailings(at)oopsware(dot)de>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PATCH] [v8.5] Security checks on largeobjects
Date: 2009-07-01 23:59:58
Message-ID: 4A4BF87E.7010107@ak.jp.nec.com
Lists: pgsql-hackers

I found one more issue with applying largeobject-style interfaces
to generic toasted varlena data.

When we fetch a toasted datum, pg_toast_%u is scanned with SnapshotToast.
This relies on the assumption that toasted chunks never have multiple
versions, so visibility of the toast pointer always implies visibility of
the toast chunks.

However, if we provide largeobject-style interfaces that allow partial
updates of a toasted varlena, it seems to me this assumption will no
longer hold.
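
To make the concern concrete, here is a toy model in C. It is not
PostgreSQL code: ToyChunk, toy_count_chunks() and toy_partial_update() are
hypothetical names, and the scan is modeled as ignoring per-chunk version
information entirely, which is only my reading of the assumption described
above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for one row of a toast relation; not the real layout. */
typedef struct ToyChunk
{
    unsigned    value_id;   /* which toasted value the chunk belongs to */
    int         seq;        /* chunk sequence number within that value */
    int         version;    /* 0 = original, 1 = written by a partial update */
} ToyChunk;

/*
 * Model of a scan that trusts the toast pointer: every stored row for
 * (value_id, seq) is counted, and the version field is never consulted.
 */
static size_t
toy_count_chunks(const ToyChunk *rel, size_t nrows, unsigned value_id, int seq)
{
    size_t      hits = 0;

    for (size_t i = 0; i < nrows; i++)
    {
        if (rel[i].value_id == value_id && rel[i].seq == seq)
            hits++;
    }
    return hits;
}

/*
 * Hypothetical partial update of one chunk. As with any MVCC update, the
 * old row is left in place and a new row version is appended.
 * Returns the new number of rows.
 */
static size_t
toy_partial_update(ToyChunk *rel, size_t nrows, size_t capacity,
                   unsigned value_id, int seq)
{
    assert(nrows < capacity);

    rel[nrows].value_id = value_id;
    rel[nrows].seq = seq;
    rel[nrows].version = 1;
    return nrows + 1;
}
```

In this model, a value stored as chunks 0, 1 and 2 yields exactly one row
for chunk 1 before the update, and two rows for chunk 1 after
toy_partial_update() rewrites it, so the reader can no longer tell which
one belongs to its snapshot.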

Does anyone have a good idea?

KaiGai Kohei wrote:
> I concluded that the following issues need to be solved before we can
> apply largeobject-like interfaces to big toasted data in general
> relations, not only in the pg_largeobject system catalog.
>
> First, we need a new strategy for storing the given varlena data
> in the external toast relation.
> To seek to and fetch a certain range of data, we must be able to
> compute which chunks hold the bytes specified by offset and
> length. So the external chunks must always be uncompressed. This is a
> requirement common to both read and write operations.
> To update part of the toasted data, the datum must not be
> inlined regardless of its length, because otherwise we would need to
> rewrite the whole tuple that contains the inlined data.
> If the toasted varlena is opened in read-only mode, an inlined datum
> causes no problem. This is an issue only for write operations.
>
> I would like to add a new strategy on pg_type.typstorage with the following
> characteristics:
> 1. It always stores the given varlena data without any compression,
> so the data is stored as a set of fixed-length chunks.
> 2. It always stores the given varlena data in the external toast relation.
>
> I suggest a new built-in type named BLOB which has an identical definition
> to the BYTEA type, except for its attstorage.
>
> Next, a different version of lo_open() should be provided to accept
> BLOB type as follows:
>
> SELECT pictname, lo_open(pictdata, x'20000'::int) FROM my_picture;
>
> It will allocate a largeobject descriptor for the given BLOB data,
> and the user can then read and write it through the loread() and
> lowrite() interfaces.
>
> issue:
> In this case, should it hold the relation handler and locks on the
> "my_picture" relation, not only on its toast relation?
> issue:
> Should lo_open() in read-only mode also be available for the existing
> TEXT or BYTEA types? I could not find any reason to disallow it.
>
> Next, the pg_largeobject system catalog can be redefined using the BLOB
> type as follows:
>
> CATALOG(pg_largeobject,2613)
> {
>     Oid         loowner;    /* OID of the largeobject owner */
>     Oid         lonsp;      /* OID of the largeobject namespace */
>     aclitem     loacl[1];   /* access permissions */
>     blob        lodata;     /* contents of the largeobject */
> } FormData_pg_largeobject;
>
> The existing largeobject interfaces would operate on the
> pg_largeobject.lodata value identified by the largeobject identifier.
> The rest of the metadata can be used for access control.
>
> Thanks,
>
> KaiGai Kohei wrote:
>> Tom Lane wrote:
>>> Bernd Helmle <mailings(at)oopsware(dot)de> writes:
>>>> It might be interesting to dig into your proposal deeper in conjunction
>>>> with TOAST (you've already mentioned this TODO). Having serial access with
>>>> a nice interface into TOAST would be eliminating the need for
>>>> pg_largeobject completely (i'm not a big fan of this one-big-system-table
>>>> approach the old LO interface currently is).
>>> Yeah, it would be more useful probably to fix that than to add
>>> decoration to the LO facility. Making LO more usable is just going to
>>> encourage people to bump into its other limitations (32-bit OIDs,
>>> 32-bit object size, finite maximum size of pg_largeobject, lack of
>>> dead-object cleanup, etc etc).
>> The reason I mentioned the named largeobject feature is
>> that DAC security checks on largeobjects require them to belong to
>> a certain schema, so I thought it would be quite natural for them to
>> have a string name. However, it is obviously not an important point
>> for me.
>>
>> I also agree with your opinion that the largeobject interfaces should be
>> redefined to access parts of a TOAST'ed varlena data structure,
>> not only pg_largeobject.
>>
>> In this case, we will need a new pg_type.typstorage option which
>> forces the given varlena data to be stored in the external relation
>> without compression, because we cannot compute the data offset in
>> inlined or compressed external varlena data.
>>
>> I'll try to submit a design within a few days.
>> Thanks,
>
>
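
For reference, the chunk lookup that the quoted design depends on
(uncompressed, fixed-length external chunks) could be sketched as below.
This is only a minimal illustration under stated assumptions:
TOY_CHUNK_SIZE, ChunkRange and chunk_range() are hypothetical names, and
the chunk size of 2000 bytes is a made-up value, not the one the server
actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed chunk size in bytes; the real value differs. */
#define TOY_CHUNK_SIZE ((size_t) 2000)

/* Which chunks a byte range [offset, offset + length) touches. */
typedef struct ChunkRange
{
    size_t      first_seq;      /* sequence number of the first chunk */
    size_t      last_seq;       /* sequence number of the last chunk (inclusive) */
    size_t      first_offset;   /* byte offset of the range inside the first chunk */
    size_t      last_length;    /* bytes of the last chunk covered, from its start */
} ChunkRange;

/*
 * Map (offset, length) to chunk sequence numbers. This is plain integer
 * arithmetic only because every chunk holds exactly TOY_CHUNK_SIZE
 * uncompressed bytes; length must be greater than zero.
 */
static ChunkRange
chunk_range(size_t offset, size_t length)
{
    ChunkRange  r;
    size_t      end;

    assert(length > 0);
    end = offset + length - 1;  /* last byte touched */

    r.first_seq = offset / TOY_CHUNK_SIZE;
    r.last_seq = end / TOY_CHUNK_SIZE;
    r.first_offset = offset % TOY_CHUNK_SIZE;
    r.last_length = end % TOY_CHUNK_SIZE + 1;
    return r;
}
```

With the assumed chunk size, chunk_range(3500, 1000) gives first_seq = 1,
last_seq = 2, first_offset = 1500 and last_length = 500. With compressed
or inlined data no such direct computation is possible, which is why the
proposed storage strategy rules both out.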

--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
