Re: pg_dump needs SELECT privileges on irrelevant extension table

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Jacob Champion <jchampion(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: pg_dump needs SELECT privileges on irrelevant extension table
Date: 2023-07-14 12:04:18
Message-ID: a185f66f-1edc-426e-41a4-091dad230c75@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers


On 2023-06-29 Th 12:24, Jacob Champion wrote:
> On 3/20/23 10:43, Tom Lane wrote:
>> I'd be more willing to consider the proposed patch if it weren't such
>> a hack --- as you say, it doesn't fix the problem when the table has
>> policies, so it's hardly a general-purpose solution. I fear that it's
>> also fairly expensive: adding sub-selects to the query we must do
>> before we can lock any tables is not appetizing, because making that
>> window wider adds to the risk of deadlocks, dump failures, etc.
> (moving to -hackers and July CF)
>
> = Recap for Potential Reviewers =
>
> The timescaledb extension has an internal table that's owned by the
> superuser. It's not dumpable, and other users can only access its
> contents through a filtered view. For our cloud deployments, we
> additionally have that common trope where the most privileged users
> aren't actually superusers, but we still want them to be able to perform
> a subset of maintenance tasks, including pg_dumping their data.
>
> When cloud users try to dump that data, pg_dump sees that this internal
> table is an extension member and plans to dump ACLs, security labels,
> and RLS policies for it. (This behavior cannot be overridden with
> --exclude-table. pg_dump ignores that flag for extension members.)
> Dumping policies requires the use of pg_get_expr() on the backend; this,
> in turn, requires a lock on the table with ACCESS SHARE.
>
> So pg_dump tries to lock a table, with no policies, that it's not going
> to dump the schema or data for anyway, and it fails because our users
> don't have (and shouldn't need) SELECT access to it. For an example of
> this in action, I've attached a test case in v2-0001.
>
> = Proposal =
>
> Since this is affecting users on released Postgres versions, my end goal
> is to find a fix that's backportable.
>
> This situation looks very similar to [1], where non-superusers couldn't
> perform a dump because we were eagerly grabbing table locks to read
> (non-existent) ACLs. But that was solved with the realization that ACLs
> don't need locks anyway, which is unfortunately not applicable to policies.
>
> My initial patch to -bugs was a riff on a related performance fix [2],
> which figured out which tables had interesting ACLs and skipped that
> part if nothing was found. I added the same kind of subselect for RLS
> policies as well, but that had nasty corner cases where it would perform
> terribly, as Tom alluded to above. (In a cluster of 200k tables, where
> one single table had 10M policies, the query ground to a halt.)
>
> So v2-0002 is instead inspired by Tom's rewrite of that ACL dump logic
> [3]. It scans pg_policy separately, stores the tables it finds into the
> catalog map on the client side, and then again skips the policy dump
> (and therefore the lock) if no policies exist. The performance hit now
> scales with the size of pg_policy alone.
>
> This is a bit more invasive than the subselect, but hopefully still
> straightforward enough to be applicable to the back branches' old
> catalog map strategy. It's still not a general-purpose fix, as Tom
> pointed out above, but that was true of the discussion in [1] as well,
> so I'm optimistic.
>
> WDYT?
>
> --Jacob
>
> [1]
> https://postgr.es/m/CAGPqQf3Uzo-yU1suYyoZR83h6QTxXxkGTtEyeMV7EAVBqn%3DPcQ%40mail.gmail.com
> [2]https://git.postgresql.org/cgit/postgresql.git/commit/?id=5d589993
> [3]https://git.postgresql.org/cgit/postgresql.git/commit/?id=0c9d8442

Seems reasonable at first glance. Isn't it going to save some work
anyway later on, so the performance hit could end up negative?

cheers

andrew

--
Andrew Dunstan
EDB:https://www.enterprisedb.com

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Artem Anisimov 2023-07-14 13:05:51 Re: BUG #17949: Adding an index introduces serialisation anomalies.
Previous Message Akshat Jaimini 2023-07-14 07:45:53 Re: pg_dump needs SELECT privileges on irrelevant extension table

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Pyhalov 2023-07-14 13:40:16 Re: Partial aggregates pushdown
Previous Message Yugo NAGATA 2023-07-14 11:32:01 Re: pgbnech: allow to cancel queries during benchmark