Re: [0/4] Proposal of SE-PostgreSQL patches

From: KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org, KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>, kaigai(at)ak(dot)jp(dot)nec(dot)com
Subject: Re: [0/4] Proposal of SE-PostgreSQL patches
Date: 2008-05-07 05:52:27
Message-ID: 4821439B.8010006@ak.jp.nec.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Tom, Thanks for your reviewing.

Tom Lane wrote:
> KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com> writes:
>> I updated the series of SE-PostgreSQL patches for the latest pgsql-8.4devel tree.
>
> I tried to do a bit of testing of this, but it does not work on current
> Fedora 8, because the policy module doesn't build:
>
> [tgl(at)rh2 sepgsql-policy]$ make
> make -f /usr/share/selinux/devel/Makefile NAME=targeted
> make[1]: Entering directory `/home/tgl/sepgsql/contrib/sepgsql-policy'
> Compiling targeted sepostgresql module
> /usr/bin/checkmodule: loading policy configuration from tmp/sepostgresql.tmp
> sepostgresql.te:349:ERROR 'syntax error' at token 'corenet_tcp_recvfrom_labeled' on line 5675:
> # NOTE: These changes are to be merged in the later releases.
> corenet_tcp_recvfrom_labeled(sepgsql_server_type, sepgsql_client_type)
> /usr/bin/checkmodule: error(s) encountered while parsing configuration
> make[1]: *** [tmp/sepostgresql.mod] Error 1
> make[1]: Leaving directory `/home/tgl/sepgsql/contrib/sepgsql-policy'
> make: *** [sepostgresql.pp] Error 2
> [tgl(at)rh2 sepgsql-policy]$

This policy can be built on Fedora 9 or rawhide now.
The latest selinux-policy on Fedora 8 does not contain labeled networking
interfaces, so it makes errors.

FYI,
I also have a discussion to merge the SE-PostgreSQL security policy module
into the default upstreamed policy (called as reference policy).
http://marc.info/?t=120574668300001&r=1&w=2

I guess it will be merged former than the release date of PostgreSQL v8.4,
so it may not be necessary to include it within contrib/ .

In the next patch set, I like to provide it with different form.

> In the meantime, here are some random comments after my failed test
> build and a very fast scan through the patch:

> The patch tries to re-add ipcclean to the source tree --- probably a
> merge error?

I don't intend to modify ipcclean.
It is a bug in patch generation scripts, to be fixed soon.

> autoconf complains about the description-free AC_DEFINEs

OK, I add the following descriptions for my AC_DEFINEs.

AC_DEFINE(SECURITY_SYSATTR_NAME, "security_context",
[Enables system column for security attribute support])
AC_DEFINE_UNQUOTED(HAVE_SELINUX, 1,
[Enables SE-PostgreSQL feature])

> Doesn't compile warning-free if selinux not enabled ... for that matter,
> it doesn't compile warning-free if selinux IS enabled.

Does it mean you run configure script without --enable-selinux?
I also confirmed some warnings to be fixed in the next patch set.
Or, is selinux disabled on your workstation?

> No documentation updates whatsoever :-(

OK, I'll add a new documentation at doc/src/sgml .
Is it your intention, isn't it?

> About half of the patch seems to be conditional on
> #ifdef SECURITY_SYSATTR_NAME
> and the other half on
> #ifdef HAVE_SELINUX
> This seems bizarre: is there really any chance that there are two
> independently usable chunks of code here? And why is it conditional
> on a macro that is a field name, rather than conditional on a feature
> macro? That is, I'd expect to see something like
> #ifdef ENABLE_SEPOSTGRES
> throughout.

SECURITY_SYSATTR_NAME is a name of security attribute system column,
and used to enable/disable this feature provided by PGACE.
I think PGACE has possible users more than SELinux, like Solaris TX.
Applying separated symbols are worthwhile when a new security module
requires security attribute but it is not SELinux.

> For that matter, what is the point of treating SECURITY_SYSATTR_NAME
> as a configurable thing in the first place? I can hardly imagine a
> worse idea than a security-critical column having different names in
> different installations.

SECURITY_SYSATTR_NAME is defined as "security_context" in SE-PostgreSQL.
However, it is a special term of SELinux.
If we can agree this name, we don't have to abstract the name of security
attribute system column.

> The patch hasn't got a mode in which SELinux support is compiled in but
> not active. This is a good way to ensure that no one will ever ship
> standard RPMs with the feature compiled in, because they will be entirely
> nonfunctional for people who aren't interested in setting up SELinux.
> I think you need an "enable_sepostgres" GUC, or something like that.
> (Of course, the overhead of the per-row security column would probably
> discourage anyone from wanting to use such a configuration anyway,
> so maybe the point is moot.)

We can turn on/off SELinux globally, not bounded to SE-PostgreSQL.
The reason why I didn't provide a mode bit like "enable_sepostgresql"
is to keep consistency in system configuration.
It is undesirable situation to disable SELinux on a partial inter-process
communication channels.

BTW, what does it mean the overhead of the per-row security column?
When SELinux is disabled, it is not labeled and does not consume any
additional spaces.

> sepgsql-policy has got usability problems:
> * It should pay attention to the configured installation PREFIX instead of
> hardwiring a couple of random possible installation locations
> * It can only support the build machine's SELINUXTYPE --- how am I supposed
> to produce RPMs that support all available types?

As noted above, I'll provide its security policy in a different way.
The hardcoded parameters are also changed to configurable ones.

> The contents and use of sepgsqlGetTupleName() make it look like the
> entire security scheme is based on object name alone. That doesn't
> even account for schemas, let alone overloaded function names. Please
> tell me this is not as broken-by-design as it looks.

The result of sepgsqlGetTupleName() is used to generate access denied
logs as a name of target object, not used to access controls.

> I occasionally tell people "try to make the patch look like it's always
> been there". This is pretty far from meeting that goal. Random bits
> of code that are commented "PGACE:" are obviously not intended to just
> fit in. You've generally ignored the Postgres code layout conventions
> (pgindent will help to some extent but it's far from a panacea) and our
> commenting conventions --- eg, hardly any of the functions have header
> comments, and the ones that do follow conventions seen noplace in the
> Postgres code, like using "@" on parameter names. In general the number
> and quality of the comments is far below the standard for Postgres code,
> and the lack of any implementation documentation isn't helping.

OK, I'll improvement coding style, comments and so on.

> Another big problem, which I understand your motivation for but that
> doesn't make the code any less ugly, is that you've got trivial bits
> of code that're separated by two(!) levels of hook calls from where
> they're actually being used. Not only does that make it unreadable
> but the files that actually do the work combine bits of code that
> should be scattered across a lot of modules, causing those files to
> be just horrid from a modularity standpoint --- they've got their
> fingers stuck in everyplace. If you want this to get applied you need
> to start thinking of it as an integral part of the code, not an add-on.

I thought small patches help reviewer check it comfortably.
However, it is unwelcome style...
I'll update the next patch without separating both features.

> Some other bits of add-on-itis:
> * If you need a dedicated LWLock, declare it in lwlock.h.
> * If you need a node type, declare it in nodes.h (T_SEvalItem is utterly
> broken)

OK, I'll move them.

> Why in the world would you have security restrictions associated with
> TOAST tuples? Seems like all the interesting restrictions should be
> on the parent table.

sepgsqlHeapTuple*() hooks are also invoked for tuples within TOAST tables,
but it does not apply any restriction. When given tuple is belong to
TOAST tables, SE-PostgreSQL skips its checks.

> Don't randomly invent your own style of management of a postmaster
> child process. For one thing, this code doesn't cope with either
> unexpected death of the postmaster or unexpected death of the child.
> If you need another child, manage it in postmaster.c the same way
> the other children are managed.

OK,

> The code in hooks.c looks suspiciously not-HOT-aware, eg use of
> ItemIdIsUsed() for what probably needs to be ItemIdIsNormal().
> (Not that this code ought to be fetching the tuple for itself
> in the first place --- probable big performance loss there...)

Does GetTupleForTrigger() help me as a sample to obtain the older
version of tuple on the hooks of simaple_heap_update()?

> pgaceHooks.c seems to be a useless layer of indirection
> --- lose it all, and inline into callers instead.

OK, I'll push them into inline function and remove pgaceHooks.c.

> Is the hard-wired shmem cache size really adequate? Why are you
> using such a cache in shared memory at all, rather than backend-local?
> The locking implications likely take away more performance than you
> save by not having each session need to load up its cache.
> (We don't use shared catalog caches, in general.)

I think caches of security policy should be placed on shared memory.
If backend held it locally, it requires some system call invokations
per connection at least, although the security policy is seldom reloaded.
However, we have to invalidate the cache when it is reloaded.
If every backend have its cache, cache invalidation will be a uneasy task.

> If we're going to support assignment to system columns, we probably
> want a general solution that will work for OID not only the security
> column. Also, I'm unconvinced that setting resjunk = true for such
> targetlist entries is a good idea.

OK, it is possible to expand.
However, why are you unconvinced for setting resjunk = true, when users
try to update system column? I think it is a reasonable way to compute
right hand of terms, like:
UPDATE foo SET security_context = security_context::text || ':Classified' ...

> The whole "early security" business looks like a mess :-(. I suspect
> you should rip all that out of the backend and add a step to initdb
> that fills in those tables.

I also think "early security" codes are ad-hoc. :-(
Pushing it into initdb seems me a good idea.
I'll try to consider whether it is possible, or not.

> The idea of input functions that alter system tables scares me.

Are you saying about pg_security system catalog?

> elog() should not be used for user-facing errors. I couldn't easily
> tell just which of the messages are likely to be seen by users and
> which ones should be "can't happen" cases, but certainly there are
> a whole lot of these that need to be ereport()s. Likely there need
> to be some new ERRCODEs too.

OK, I'll replace elog() by ereport().
Is adding a new error code (ERRCODE_SECURITY_VIOLATION) in Class XX appropriate?

> __lookupRelationForm() strikes me as a complete crock. It probably
> means that you've selected the wrong places to call its callers from.
> This gets back to the point above that adding additional fetches of
> tuples isn't good for performance anyway.

The purpose of __lookupRelationForm() is to obtain the HeapTuple of
parent relation when new attributes are inserted into pg_attribute
system catalog.
But we cannot refer the tuple of parent table because CommandCounterIncrement()
are not invoked between inserting a tuple into pg_class and pg_attribute.

Is there any good idea to refer a tuple of parent table just before
tuples are inserted into pg_attribute?

> Don't use identifiers with a leading double-underscore. These are
> reserved for system-private identifiers according to the C standard.

OK, I'll fix rest of them.

> Use of a function in genbki.sh is very likely not portable.

OK, I'll consider something

> use of flock() is probably not portable and even less probably necessary.

OK, I'll fix it.

> Declaring a function foo() rather than foo(void) is poor style ---
> at least some compilers will complain about that.

OK, I'll fix these kind of function declarations.

> I see a lot of "Copyright 2007 KaiGai Kohei" notices. Are you
> willing to assign those copyrights to the Postgres Global Development
> Group? If not, we'll at least need statements along the line of
> "Distributed under the PostgreSQL license".

No need to say, I'm willing to assign these copytights to the PostgreSQL
Global Development Group.
Can I replace them on the next patch set?

Thanks,
--
OSS Platform Development Division, NEC
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alex Hunsaker 2008-05-07 06:20:37 Re: [badalex@gmail.com: Re: [BUGS] Problem identifying constraints which should not be inherited]
Previous Message Tom Lane 2008-05-07 05:52:10 Re: alter + preserving dependencies

Browse pgsql-patches by date

  From Date Subject
Next Message Alex Hunsaker 2008-05-07 06:20:37 Re: [badalex@gmail.com: Re: [BUGS] Problem identifying constraints which should not be inherited]
Previous Message Bruce Momjian 2008-05-07 04:35:06 Re: [HACKERS] Re: a tsearch2 (8.2.4) dictionary that only filters out stopwords