BUG #19015: insert-returning failure: stable function in select policy doesn't see before-insert trigger change

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: di(at)nmfay(dot)com
Subject: BUG #19015: insert-returning failure: stable function in select policy doesn't see before-insert trigger change
Date: 2025-08-07 19:56:35
Message-ID: 19015-361c9035ff9d47d0@postgresql.org
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 19015
Logged by: Dian Fay
Email address: di(at)nmfay(dot)com
PostgreSQL version: 18beta2
Operating system: Linux x64
Description:

This script is a minimal example of a system that uses row-level security
with an access-control list table and a policy on an RLS-enabled table
invoking a function to check the user's ACL on read. The intent is for
users' visibility on records in the `test` table to be restricted to those
for which they have an ACL entry; however, any user should be able to
_insert_ a new `test` record, and an ACL entry is automatically generated by
a `before insert` trigger.

There is a surprising interaction between function volatility and the use of
`returning`. Specifically, _if_ the `can_read` function in the `for select`
policy is marked `stable`, _and_ an insert into `test` has a `returning`
clause (included at the end of the following script), the server declares
that the new row violates an unnamed RLS policy:

> ERROR: new row violates row-level security policy for table "test"

If the insert does not return values, the function can be `stable`; if the
function is `volatile`, the insert with `returning` succeeds.

There's no immediate reason to check ACLs at any point in the transaction
before the new ACL is registered, so `stable` should in theory be the
appropriate volatility level. It's definitely desirable; in this example
we're calling `can_read` with a different value for every tuple, but in the
original system we pass it a much more limited set of parent ids, and
redundant execution adds up quickly.

Following ExecInsert, I see the `before insert` triggers firing before it
checks RLS policies. When it does get to the ExecWithCheckOptions call,
`resultRelInfo` has ri_WithCheckOptions and ri_WithCheckOptionExprs. The
former WCO has no `polname` but is a WCO_RLS_INSERT_CHECK. The expr is a
FuncExpr representing `can_read` from the `for select` policy. Since the
`before` triggers have already run and registered an ACL for this record,
and because this is the first time `can_read` executes, it should see the
ACL entry, but does not unless marked `volatile`.

If I omit the `returning`, ri_WithCheckOptions is nil; I'm guessing the
dead-simple `true` policy for insert was inlined or discarded at some point
(it's there in fireRIRrules), but haven't tracked it down.

Questions:

- the fact that a `volatile` function can see the ACL entry suggests that
it's being executed before the ACL registration trigger and a `stable`
result is cached and reused. `explain` does show an InitPlan that seems a
bit suspect, but stepping into ExecWithCheckOptions -> ExecQual with the
function marked `stable` shows an IndexScan happening. That seems to confirm
that `can_read` is executed only after the trigger inserts an ACL entry. Why
does the stable `can_read` not see it?
- setting the main issue aside, it is already possible to insert records
that a `for select` policy doesn't allow you to see in a subsequent query.
Should a violation of the `for select` policy by a `returning` clause yield
an empty result instead of scuppering the insert to be consistent?

I encountered the problem in 15.12 but did the debugging against master. I'm
interested in hacking on this but am not really sure where to look next --
even accounting for the `for select` policy qual being pulled up into a
WithCheckOption on the insert, it seems like it's testing WCOs after the
trigger.

-- create role "team-member";
drop table if exists test;
drop function if exists can_read;
drop function if exists register_acl;
drop table if exists acl;

create table test (
id int generated always as identity primary key,
num int,
val text
);
create table acl (
item int,
team text,
can_read boolean,
can_write boolean,
primary key (item, team)
);

create function register_acl()
returns trigger
language plpgsql
as $$begin
insert into acl(item, team, can_read, can_write)
values (new.id, current_setting('assumed_role.team'), true, true)
on conflict do nothing;

return new;
end$$;

create trigger test_register_acl before insert on test
for each row execute function register_acl();

alter table test enable row level security;

grant select, insert on test to "team-member";
grant select, insert on acl to "team-member";

create function can_read(_item int, _team text)
returns boolean
language sql
-- as long as the final `test` insert adds `returning`, the `stable`
-- function cannot see acls inserted by the `before insert` trigger,
-- while a `volatile` function can
stable
begin atomic
select exists (
select 1 from acl
where item = _item
and team = _team
and can_read is true
);
end;

create policy test_tm_select on test for select to "team-member"
-- using (
-- (select can_read(id, (select current_setting('assumed_role.team'))))
-- );
using (
can_read(id, (select current_setting('assumed_role.team')))
);

create policy test_tm_insert on test for insert to "team-member"
with check (true);

set role "team-member";
select set_config('assumed_role.team', 'demo', false);

explain
insert into test(num, val) values (1, 'test')
returning id, num, val; -- without `returning`, we can insert with stable
can_read
-- just fine since the select policy never comes into play

Browse pgsql-bugs by date

  From Date Subject
Next Message Peter Praxmarer 2025-08-07 20:15:06 Re: BUG: IS NOT NULL on RECORD variable fails in 17.5-dev
Previous Message David G. Johnston 2025-08-07 19:54:07 Re: BUG: IS NOT NULL on RECORD variable fails in 17.5-dev