From: | PG Bug reporting form <noreply(at)postgresql(dot)org> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Cc: | jnasby(at)upgrade(dot)com |
Subject: | BUG #18679: Planner issue with bitmap scan recheck on external TOAST |
Date: | 2024-10-30 23:02:57 |
Message-ID: | 18679-7ef0484db827d6e3@postgresql.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 18679
Logged by: Jim Nasby
Email address: jnasby(at)upgrade(dot)com
PostgreSQL version: 16.4
Operating system: RDS
Description:
I’ve been testing use of a BRIN index on record creation date (timestamptz)
on a snapshot of a production system. Note that after creating the BRIN
index the number of buffers being accessed jumps from 23838 to 191663. Based
on what EXPLAIN is showing, I believe the issue is that the planner doesn’t
understand that each additional row that goes through the
(repsrv_account_ids(data) && …) recheck results in fetching at least one
TOAST chunk. (I’d like to know if my assumption about TOAST is correct here;
it’s the only thing I can think of to explain these block numbers from the
2nd EXPLAIN…)
Heap Blocks: exact=11024
Buffers: shared hit=191663
-> BitmapAnd (cost=4903.00..4903.00 rows=14930 width=0) (actual
time=74.704..74.705 rows=0 loops=1)
Buffers: shared hit=1926
Unfortunately I haven’t been able to create an independent repo of this
issue, so this report is based on PG 16 (most recent I can test in
production). repsrv_account_ids() is a function that extracts a field from a
JSONB document (the data field). create_date is timestamptz. There’s 17
other fields in the table that I’m omitting (I can share if needed, but
would need to talk to some folks over here about it).
Indexes:
"task_execution_pkey" PRIMARY KEY, btree (id)
"task_execution__create_date_brin" brin (create_date)
"task_execution_create_date_idx" btree (create_date)
explain (analyze,buffers) SELECT 1
FROM task_execution te
WHERE
(te.create_date BETWEEN '2024-7-1'::timestamptz AND '2024-9-5'::timestamptz)
and
repsrv_account_ids(te.data) &&
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[]
;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------
Gather (cost=1236.73..103675.01 rows=11291 width=4) (actual
time=11.356..41.246 rows=9303 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=23838
-> Parallel Bitmap Heap Scan on task_execution te
(cost=236.73..101545.91 rows=4705 width=4) (actual time=6.659..34.198
rows=3101 loops=3)
Recheck Cond: (repsrv_account_ids(data) &&
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[])
Filter: ((create_date >= '2024-07-01 00:00:00+00'::timestamp with
time zone) AND (create_date <= '2024-09-05 00:00:00+00'::timestamp with time
zone))
Rows Removed by Filter: 5638
Heap Blocks: exact=14066
Buffers: shared hit=23838
-> Bitmap Index Scan on task_execution__account_ids
(cost=0.00..233.91 rows=26469 width=0) (actual time=7.304..7.304 rows=26218
loops=1)
Index Cond: (repsrv_account_ids(data) &&
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[])
Buffers: shared hit=32
Planning:
Buffers: shared hit=1
Planning Time: 0.188 ms
Execution Time: 41.791 ms
(17 rows)
CREATE INDEX task_execution__create_date_brin ON task_execution USING brin
(create_date) WITH (pages_per_range=8);
CREATE INDEX
explain (analyze,buffers) SELECT 1
FROM task_execution te
WHERE
(te.create_date BETWEEN '2024-7-1'::timestamptz AND '2024-9-5'::timestamptz)
and
repsrv_account_ids(te.data) &&
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[]
;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on task_execution te (cost=4903.00..65471.86 rows=11291
width=4) (actual time=76.688..774.362 rows=9303 loops=1)
Recheck Cond: ((repsrv_account_ids(data) &&
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[]) AND
(create_date >= '2024-07-01 00:00:00+00'::timestamp with time zone) AND
(create_date <= '2024-09-05 00:00:00+00'::timestamp with time zone))
Rows Removed by Index Recheck: 2895
Heap Blocks: exact=11024
Buffers: shared hit=191663
-> BitmapAnd (cost=4903.00..4903.00 rows=14930 width=0) (actual
time=74.704..74.705 rows=0 loops=1)
Buffers: shared hit=1926
-> Bitmap Index Scan on task_execution__account_ids
(cost=0.00..233.91 rows=26469 width=0) (actual time=5.103..5.103 rows=26218
loops=1)
Index Cond: (repsrv_account_ids(data) &&
'{3000003,4000003,5000003,3000001,5000006,8805604}'::text[])
Buffers: shared hit=32
-> Bitmap Index Scan on task_execution__create_date_brin
(cost=0.00..4663.20 rows=1461738 width=0) (actual time=68.380..68.380
rows=1999200 loops=1)
Index Cond: ((create_date >= '2024-07-01
00:00:00+00'::timestamp with time zone) AND (create_date <= '2024-09-05
00:00:00+00'::timestamp with time zone))
Buffers: shared hit=1894
Planning:
Buffers: shared hit=19
Planning Time: 1.519 ms
Execution Time: 775.039 ms
(17 rows)
SELECT :stat_fields FROM pg_stats WHERE tablename='task_execution' AND
schemaname='copy' AND attname in ('create_date', 'data');
schemaname | tablename | attname | null_frac | avg_width |
n_distinct | correlation
------------+----------------+-------------+-----------+-----------+-------------+-------------
copy | task_execution | create_date | 0 | 8 |
-0.72454053 | 0.57420146
copy | task_execution | data | 0 | 67 |
-1 | 0.75746745
(2 rows)
SELECT relpages, reltuples FROM pg_class WHERE oid =
'copy.task_execution'::regclass;
relpages | reltuples
----------+--------------
340251 | 2.601468e+06
(1 row)
From | Date | Subject | |
---|---|---|---|
Next Message | Maciej Jaros | 2024-10-31 12:21:54 | Re: BUG #18675: Postgres is not realasing memory causing OOM |
Previous Message | Erik Wienhold | 2024-10-30 22:54:12 | Re: BUG #18678: While installing the PostgreSQL wizard, it throws an error for an non found column |