From: | Marcelo Zabani <mzabani(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Partition pruning with array-contains check and current_setting function |
Date: | 2024-09-11 19:11:49 |
Message-ID: | CACgY3QY6bO46zhPb9fa_c=qi3_OLQ7t4eyZcAd8Mkd9yyPpxXA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
I managed to get a plan I was hoping for, but it still doesn't prune
partitions. I created a new operator #|<(integer[], integer) that is
defined in SQL and is basically equivalent to value=ANY(array), and a
non-stable tenants() function defined that returns an array from the
setting, and with that I could use a scalar subquery without running into
type-checking errors. This gives me an InitPlan node:
=> SET my.tenant_id='{1}';EXPLAIN (COSTS OFF) SELECT COUNT(*) FROM tbl
WHERE tenant_id #|< (select tenants());
SET
QUERY PLAN
----------------------------------------------------------
Finalize Aggregate
InitPlan 1 (returns $0)
-> Result
-> Gather
Workers Planned: 2
Params Evaluated: $0
-> Partial Aggregate
-> Parallel Append
-> Parallel Seq Scan on tbl2 tbl_2
Filter: (tenant_id = ANY ($0))
-> Parallel Seq Scan on tbl1 tbl_1
Filter: (tenant_id = ANY ($0))
It still doesn't prune even if I EXPLAIN ANALYZE it. I thought maybe I did
something wrong with the operator definition, so I tried making tenants()
immutable and removing the scalar subquery, and then it does prune:
=> SET my.tenant_id='{1}';EXPLAIN (COSTS OFF) SELECT COUNT(*) FROM tbl
WHERE tenant_id #|< tenants();
SET
QUERY PLAN
------------------------------------------------------
Aggregate
-> Seq Scan on tbl1 tbl
Filter: (tenant_id = ANY ('{1}'::integer[]))
Sadly I can't make tenants() immutable because it's a runtime setting, and
making tenants() STABLE does not lead to partition pruning with or without
the scalar subquery around it.
I'm a bit lost. It seems like postgres is fully capable of pruning
partitions for =ANY checks, and some strange detail is confusing it in this
case. I'm not sure what else to try.
On Wed, Aug 7, 2024 at 6:10 PM Marcelo Zabani <mzabani(at)gmail(dot)com> wrote:
> Hello all. I am trying to make postgres 16 prune partition for queries
> with `WHERE tenant_id=ANY(current_setting('my.tenant_id')::integer[])`, but
> I haven't been able to make it work, and naturally it impacts performance
> so I thought this list would be appropriate.
>
> Here's the SQL I tried (but feel free to skip to the end as I'm sure all
> this stuff is obvious to you!):
>
>
>
>
>
>
>
>
> *CREATE TABLE tbl (id SERIAL NOT NULL, tenant_id INT NOT NULL, some_col
> INT, PRIMARY KEY (tenant_id, id)) PARTITION BY HASH (tenant_id);CREATE
> TABLE tbl1 PARTITION OF tbl FOR VALUES WITH (MODULUS 2, REMAINDER 0);CREATE
> TABLE tbl2 PARTITION OF tbl FOR VALUES WITH (MODULUS 2, REMAINDER 1);INSERT
> INTO tbl (tenant_id, some_col) SELECT 1, * FROM
> generate_series(1,10000);INSERT INTO tbl (tenant_id, some_col) SELECT 3, *
> FROM generate_series(1,10000);*
>
> Partition pruning works as expected for this query (still not an
> array-contains check):
> *EXPLAIN ANALYZE SELECT COUNT(*) FROM tbl WHERE tenant_id=1;*
>
> When reading from a setting it also prunes partitions correctly:
>
> *SET my.tenant_id=1;EXPLAIN ANALYZE SELECT COUNT(*) FROM tbl WHERE
> tenant_id=current_setting('my.tenant_id')::integer;*
>
> It still does partition pruning if we use a scalar subquery. I can see the
> (never executed) scans in the plan.
> *EXPLAIN ANALYZE SELECT COUNT(*) FROM tbl WHERE tenant_id=(SELECT
> current_setting('my.tenant_id')::integer);*
>
> But how about an array-contains check? Still prunes, which is nice.
> *EXPLAIN ANALYZE SELECT COUNT(*) FROM tbl WHERE
> tenant_id=ANY('{1}'::integer[]);*
>
> However, it doesn't prune if the array is in a setting:
>
> *SET my.tenant_id='{1}';EXPLAIN ANALYZE SELECT COUNT(*) FROM tbl WHERE
> tenant_id=ANY(current_setting('my.tenant_id')::integer[]);*
>
> I actually expected that when in a setting, none of the previous queries
> would've done partition pruning because I thought `current_setting` is not
> a stable function. But some of them did, which surprised me.
>
> So I thought maybe if I put it in a scalar query it will give me an
> InitPlan node, but it looks like method resolution for =ANY won't let me
> try this:
> *EXPLAIN ANALYZE SELECT COUNT(*) FROM tbl WHERE tenant_id=ANY((SELECT
> current_setting('my.tenant_id')::integer[]));*
> *ERROR: operator does not exist: integer = integer[]*
>
> I tried using UNNEST, but that adds a Hash Semi Join to the plan which
> also doesn't do partition pruning.
> *EXPLAIN ANALYZE SELECT COUNT(*) FROM tbl WHERE tenant_id=ANY((SELECT
> UNNEST(current_setting('my.tenant_id')::integer[])));*
>
> My question is if there's a way to do partition pruning based on
> array-contains operator if the array is in a setting. The use-case is to
> make Row Level Security policies do partition pruning "automatically" in a
> setting where users can be in more than one tenant.
> It feels like this would work if there were a non-overloaded operator that
> takes in an array and a single element and tests for array-contains,
> because then I could use that operator with a scalar subquery and get an
> InitPlan node. But I'm new to all of this, so apologies if I'm getting it
> all wrong!
>
> Thanks in advance,
> Marcelo.
>
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2024-09-12 05:15:34 | Re: proposal: schema variables |
Previous Message | David Mullineux | 2024-09-11 10:47:41 | Re: Has gen_random_uuid() gotten much slower in v17? |