Re: Largeobject Access Controls (r2460)

From: KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Takahiro Itagaki <itagaki(dot)takahiro(at)oss(dot)ntt(dot)co(dot)jp>
Cc: KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)commandprompt(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jaime Casanova <jcasanov(at)systemguards(dot)com(dot)ec>, Greg Smith <greg(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Largeobject Access Controls (r2460)
Date: 2010-02-08 05:17:16
Message-ID: 4B6F9E5C.7060503@ak.jp.nec.com
Lists: pgsql-hackers

(2010/02/05 13:53), Takahiro Itagaki wrote:
>
> KaiGai Kohei<kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>
>>> default: both contents and metadata
>>> --data-only: same
>>> --schema-only: neither
>>
>> However, it means only large object performs an exceptional object class
>> that dumps its owner, acl and comment even if --data-only is given.
>> Is it really what you suggested, isn't it?
>
> I wonder we still need to have both "BLOB ITEM" and "BLOB DATA"
> even if we will take the all-or-nothing behavior. Can we handle
> BLOB's owner, acl, comment and data with one entry kind?

The attached patch has been refactored according to this suggestion.

By default, or with --data-only, it dumps the contents of large objects
together with their properties (owner, comment and access privileges);
it dumps nothing for them when --schema-only is given.

default: both contents and metadata
--data-only: same
--schema-only: neither
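
The rule in the table above can be sketched in C as follows. This is only
an illustration: DumpMode and dumps_large_objects() are hypothetical names
made up for this sketch, not identifiers from the patch or from pg_dump.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum for the three modes listed above (not pg_dump code). */
typedef enum
{
    DUMP_DEFAULT,
    DUMP_DATA_ONLY,
    DUMP_SCHEMA_ONLY
} DumpMode;

/*
 * All-or-nothing rule: a large object is dumped with its contents and its
 * metadata (owner, comment, ACL), or it is not dumped at all.
 */
bool
dumps_large_objects(DumpMode mode)
{
    switch (mode)
    {
        case DUMP_DEFAULT:
        case DUMP_DATA_ONLY:
            return true;        /* both contents and metadata */
        case DUMP_SCHEMA_ONLY:
            return false;       /* neither */
    }
    assert(!"unknown dump mode");
    return false;
}
```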

It replaces the existing "BLOBS" and "BLOB COMMENTS" sections with a new
"LARGE OBJECT" section, each of which is associated with a single large
object. The section header contains the OID of the large object to be
restored, so pg_restore loads that specific large object from the given
archive.

The _PrintTocData() handlers were modified to support the "LARGE OBJECT"
section, which loads only the specified large object rather than all of
the archived ones as "BLOBS" does. They can still read the "BLOBS" and
"BLOB COMMENTS" sections, but these legacy sections are never written any more.
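
As a rough sketch of the read-side behavior described above, a handler could
branch on the section name like this. RestoreAction and classify_section()
are hypothetical names for illustration only; the real _PrintTocData()
handlers in the patch are structured differently.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical actions a restore handler might take (not pg_dump code). */
typedef enum
{
    RESTORE_ONE_LARGE_OBJECT,   /* new "LARGE OBJECT" section */
    RESTORE_ALL_LEGACY_BLOBS,   /* legacy "BLOBS" section, read-only support */
    RESTORE_LEGACY_COMMENTS,    /* legacy "BLOB COMMENTS" section */
    RESTORE_OTHER               /* anything else: handled as before */
} RestoreAction;

/* Map a section name to the action taken when reading an archive. */
RestoreAction
classify_section(const char *desc)
{
    assert(desc != NULL);

    if (strcmp(desc, "LARGE OBJECT") == 0)
        return RESTORE_ONE_LARGE_OBJECT;
    if (strcmp(desc, "BLOBS") == 0)
        return RESTORE_ALL_LEGACY_BLOBS;
    if (strcmp(desc, "BLOB COMMENTS") == 0)
        return RESTORE_LEGACY_COMMENTS;
    return RESTORE_OTHER;
}
```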

The archive will no longer contain "blobs.toc", because the OIDs of the
large objects to be restored can be found in the section headers, without
any special-purpose file. This also allows the _StartBlobs() and
_EndBlobs() methods to be omitted in the tar and file formats.

Basically, I like this approach more than the previous combination of
"BLOB ITEM" and "BLOB DATA".

However, there is a known issue here.
The ACL section is categorized as REQ_SCHEMA in _tocEntryRequired(),
so we cannot dump ACLs when --data-only is given, even if the large
object itself is dumped. Of course, we could solve this by adding a few
more special cases, although that is not elegant.
It seems to me the problem comes from the fact that _tocEntryRequired()
can only return a mask of REQ_SCHEMA and REQ_DATA. Right now, it is
not easy to categorize an ACL or COMMENT section as either of them.
I think we should consider adding REQ_ACL and REQ_COMMENT to tell the
caller whether the section in question should be dumped or not.
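
To make the idea concrete, here is a minimal sketch of a mask with separate
ACL and COMMENT bits. The bit values, toc_entry_kind() and
toc_entry_wanted() are assumptions for illustration; they are not what
_tocEntryRequired() does today, and how the caller should combine the bits
is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Bit values are assumptions for this sketch. */
#define REQ_SCHEMA   0x01
#define REQ_DATA     0x02
#define REQ_ACL      0x04   /* proposed above; does not exist yet */
#define REQ_COMMENT  0x08   /* proposed above; does not exist yet */

/* Hypothetical classifier: report what kind of section an entry is. */
int
toc_entry_kind(const char *desc)
{
    assert(desc != NULL);

    if (strcmp(desc, "ACL") == 0)
        return REQ_ACL;
    if (strcmp(desc, "COMMENT") == 0)
        return REQ_COMMENT;
    if (strcmp(desc, "LARGE OBJECT") == 0)
        return REQ_DATA;
    return REQ_SCHEMA;          /* simplification: everything else is schema */
}

/* The caller chooses which kinds it wants and tests the bits separately. */
bool
toc_entry_wanted(const char *desc, int wanted)
{
    return (toc_entry_kind(desc) & wanted) != 0;
}
```

With such a mask, a caller restoring a large object under --data-only could
ask for REQ_DATA | REQ_ACL | REQ_COMMENT, while still skipping plain schema
entries.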

Any ideas?

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

Attachment Content-Type Size
pgsql-fix-pg_dump-blob-privs.5.patch application/octect-stream 37.5 KB
