Re: sandboxing untrusted code

From:	Jacob Champion <jacob(dot)champion(at)enterprisedb(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Noah Misch <noah(at)leadboat(dot)com>
Subject:	Re: sandboxing untrusted code
Date:	2026-06-01 15:46:52
Message-ID:	CAOYmi+nqsNcAC9=ZocQ+i_zTe55+-xWLD2-Nuu2E7=HDPWdruA@mail.gmail.com
Lists:	pgsql-hackers

On Thu, May 28, 2026 at 11:29 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> My core design idea is to pass around a Provenances object. Provenance
> means the ultimate origin of something and can be used, for example
> when discussing museum objects, to discuss the chain of custody.

I really like this idea in the abstract. No idea if I'll like the
code, but I think tracking the causes of all the parts of an operation
is a really good way forward.

> - There's also a decent number of places where provenance traces through
> operator classes and operator families. I haven't sorted all of this
> out yet. It would be defensible to treat these as fully-trusted
> infrastructure because they can only be owned by superusers or
> ex-superusers, but the "ex" might be an important caveat.

I think it'd be nice to develop a definition of "fully-trusted
infrastructure" so we can reason about and expand upon it.

For example, you and I have discussed the concept of "purity" (in the
mathematical sense), which might be a more powerful concept than
LEAKPROOF or IMMUTABLE while still coexisting nicely with them. A DBA
might reasonably decide that, for some sufficiently strong definition
of pure, it doesn't matter what the provenance is.
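To make that last point concrete, here is a rough sketch of what such a
DBA policy could look like. None of this is from Robert's patch: the
Link and Provenance classes, the may_execute() function, and the
is_pure flag are all hypothetical names invented for illustration, and
the sketch assumes "pure" has already been established by some other
means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One step in a chain of custody (hypothetical structure)."""
    obj: str    # e.g. "function f(int)" or "opclass int4_ops"
    owner: str  # role that owns the object


@dataclass(frozen=True)
class Provenance:
    """Ordered chain of links, outermost cause first (hypothetical)."""
    chain: tuple


def may_execute(prov, trusted_owners, is_pure=False):
    """Hypothetical DBA policy check.

    A sufficiently pure function is allowed regardless of where it came
    from; anything else is allowed only if every owner in its chain of
    custody is trusted.
    """
    if is_pure:
        return True
    return all(link.owner in trusted_owners for link in prov.chain)
```

With a chain that passes through an untrusted role, may_execute()
refuses the call unless the function is flagged pure, which is the
"it doesn't matter what the provenance is" case.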

> This is all very much work-in-progress, so if you have concerns,
> criticisms, or suggestions, I'd rather hear them now than in six
> months.

As you already know, but for the benefit of others reading along, I'm
slowly working on a capability model for the Postgres internals. It
may go absolutely nowhere. But one of the core ideas is that if you
don't have the ability to access a table or call a function yourself,
writing, e.g., a trigger to do those things accomplishes nothing. You
can't grant the trigger that capability, because you don't have it, so
a superuser invoking the trigger would just get an error.
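As a toy illustration of that rule (not the actual design, which does
not exist in this form yet), the behavior could be sketched like this.
Role, Trigger, CapabilityError, and the "read:secrets" style capability
strings are all hypothetical names made up for the example.

```python
from dataclasses import dataclass, field


class CapabilityError(Exception):
    """Raised when code tries to use or grant a capability it lacks."""


@dataclass
class Role:
    name: str
    capabilities: set = field(default_factory=set)  # e.g. {"read:accounts"}


@dataclass
class Trigger:
    author: Role
    needs: set                                 # capabilities the body uses
    granted: set = field(default_factory=set)  # capabilities delegated to it

    def grant_from_author(self, capability):
        # An author can only delegate what they hold themselves.
        if capability not in self.author.capabilities:
            raise CapabilityError(
                f"{self.author.name} cannot grant {capability}")
        self.granted.add(capability)

    def fire(self, invoker):
        # The invoker's own capabilities are deliberately ignored here:
        # the trigger runs with only what it was granted.
        missing = self.needs - self.granted
        if missing:
            raise CapabilityError(f"trigger lacks {sorted(missing)}")
        return "ok"
```

In this sketch, a trigger written by a role without "read:secrets"
raises CapabilityError both when the author tries to grant it that
capability and when a superuser later fires it.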

Even if a model like that were 100% perfect, I think I'd *still* want
provenance tracking. One reason is that DBAs would need to figure out
their current state of affairs, with all the historical baggage
they've built up, in order to migrate to a (much stricter) capability
model. Another reason is that deciding whether and when capabilities
can be transferred across trust boundaries is one of the many
devils-in-the-details, so tracking the chain of custody to identify
those boundaries seems really important.

</small tangent>

All this to say: I like it.

Thanks,
--Jacob
