Query performance

From: "Dirschel, Steve-CW" <Steve(dot)Dirschel(at)bestbuy(dot)com>
To: "pgsql-performance(at)lists(dot)postgresql(dot)org" <pgsql-performance(at)lists(dot)postgresql(dot)org>
Subject: Query performance
Date: 2026-04-27 15:52:06
Message-ID: BYAPR06MB447199DE1FCF363EF389EF1A8A362@BYAPR06MB4471.namprd06.prod.outlook.com
Lists: pgsql-performance

Aurora Postgres version 17.4.

Table in question:

\d poslog_publisher_rms_stage
Table "public.poslog_publisher_rms_stage"
Column | Type | Collation | Nullable | Default
-------------------+--------------------------+-----------+----------+---------
stage_id | uuid | | not null |
status | character varying(100) | | |
message_body | text | | not null |
error_code | character varying(100) | | |
error_category | character varying(100) | | |
error_message | text | | |
error_retry_count | integer | | | 0
create_date | timestamp with time zone | | not null | now()
modified_date | timestamp with time zone | | not null | now()
Indexes:
"poslog_publisher_rms_stage_pkey" PRIMARY KEY, btree (stage_id)
"idx_poslog_publisher_stage_create_date_col" btree (create_date)
"idx_poslog_publisher_stage_status_error_retry_count_modi_date_c" btree (status, error_retry_count, modified_date)
Referenced by:
TABLE "poslog_publisher_rms_detail" CONSTRAINT "fk_poslog_publisher_detail_stage_id" FOREIGN KEY (stage_id) REFERENCES poslog_publisher_rms_stage(stage_id)
Publications:
"sashpsrms_publication"

The table is constantly being loaded into. Rows are inserted with a "ready" status, and a different process looks for that status and updates it to "processed" after handling the row. We have multiple tables like this.
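For illustration, the lifecycle looks roughly like this (a sketch only; the 'READY'/'PROCESSED' status strings and the columns supplied by the real loader are placeholders, not the exact values our application uses):

```sql
-- Loader inserts new work (real code supplies the actual message body)
INSERT INTO poslog_publisher_rms_stage (stage_id, status, message_body)
VALUES (gen_random_uuid(), 'READY', '...');

-- Consumer marks a row done after processing it
UPDATE poslog_publisher_rms_stage
SET status = 'PROCESSED', modified_date = now()
WHERE stage_id = $1;
```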

Then every 2 hours a different process runs this query looking for failed or unprocessed rows:

select
ppse.stage_id as stageId,
ppse.status as status,
ppse.message_body as messageBody
from
poslog_publisher_rms_stage ppse
where
ppse.status in ('UNPROCESSED','FAILED')
and ppse.error_retry_count < 3
order by
ppse.create_date
limit 100;

If I run that query with EXPLAIN (ANALYZE, BUFFERS), it does a huge amount of work to find 0 rows. The index it uses seems appropriate for the query.

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=10.92..10.93 rows=1 width=1019) (actual time=66566.823..66566.824 rows=0 loops=1)
Buffers: shared hit=1509768 read=2011479
I/O Timings: shared read=79792.017
-> Sort (cost=10.92..10.93 rows=1 width=1019) (actual time=66566.821..66566.821 rows=0 loops=1)
Sort Key: create_date
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=1509768 read=2011479
I/O Timings: shared read=79792.017
-> Index Scan using idx_poslog_publisher_stage_status_error_retry_count_modi_date_c on poslog_publisher_rms_stage ppse (cost=0.57..10.91 rows=1 width=1019) (actual time=66566.761..66566.761 rows=0 loops=1)
Index Cond: (((status)::text = ANY ('{UNPROCESSED,FAILED}'::text[])) AND (error_retry_count < 3))
Buffers: shared hit=1509765 read=2011479
I/O Timings: shared read=79792.017
Planning:
Buffers: shared hit=195 read=1
I/O Timings: shared read=1.038
Planning Time: 2.909 ms
Execution Time: 66581.498 ms

The query did 3.5 million block reads while scanning the index, of which 1.5 million were found in memory and 2 million were read from disk. Five seconds later I ran the exact same query again:

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=10.92..10.93 rows=1 width=1019) (actual time=23.589..23.591 rows=0 loops=1)
Buffers: shared hit=18736
-> Sort (cost=10.92..10.93 rows=1 width=1019) (actual time=23.588..23.589 rows=0 loops=1)
Sort Key: create_date
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=18736
-> Index Scan using idx_poslog_publisher_stage_status_error_retry_count_modi_date_c on poslog_publisher_rms_stage ppse (cost=0.57..10.91 rows=1 width=1019) (actual time=23.583..23.584 rows=0 loops=1)
Index Cond: (((status)::text = ANY ('{UNPROCESSED,FAILED}'::text[])) AND (error_retry_count < 3))
Buffers: shared hit=18736
Planning Time: 0.118 ms
Execution Time: 23.628 ms

Now it only did 18k block reads, all in memory. It used the same index and again returned 0 rows. Between those two runs I looked at pg_stat_user_tables and could see that n_tup_ins, n_tup_del, n_live_tup, and n_dead_tup each increased by 13. n_live_tup was 96 million and n_dead_tup was 8.4 million. 18k logical reads to find 0 rows is still high, but I believe that is most likely caused by the 8.4 million dead tuples.
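For reference, this is roughly the query I used to watch those counters between runs (all columns are standard pg_stat_user_tables columns):

```sql
SELECT n_tup_ins, n_tup_upd, n_tup_del,
       n_live_tup, n_dead_tup,
       last_autovacuum, autovacuum_count
FROM pg_stat_user_tables
WHERE relname = 'poslog_publisher_rms_stage';
```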

15 minutes later I ran the query again:

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=10.92..10.93 rows=1 width=1019) (actual time=59431.795..59431.796 rows=0 loops=1)
Buffers: shared hit=1286676 read=1734000
I/O Timings: shared read=70153.578
-> Sort (cost=10.92..10.93 rows=1 width=1019) (actual time=59431.794..59431.794 rows=0 loops=1)
Sort Key: create_date
Sort Method: quicksort Memory: 25kB
Buffers: shared hit=1286676 read=1734000
I/O Timings: shared read=70153.578
-> Index Scan using idx_poslog_publisher_stage_status_error_retry_count_modi_date_c on poslog_publisher_rms_stage ppse (cost=0.57..10.91 rows=1 width=1019) (actual time=59431.789..59431.790 rows=0 loops=1)
Index Cond: (((status)::text = ANY ('{UNPROCESSED,FAILED}'::text[])) AND (error_retry_count < 3))
Buffers: shared hit=1286676 read=1734000
I/O Timings: shared read=70153.578
Planning Time: 0.114 ms
Execution Time: 59431.839 ms

Total block reads increased from 18k back to 3 million. n_tup_ins, n_tup_del, n_live_tup, and n_dead_tup each increased by 10,320 over that 15-minute period.

What is going on here that this query has to do 3+ million block reads to find 0 rows? And how is it possible that when I run the query twice in a row, the logical reads on the second run come down so dramatically? Is this somehow related to determining whether rows are visible? When I waited 15 minutes between runs, the inserted/updated rows only increased by 10.3k, yet total block reads increased by 3 million.
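One thing I am considering, in case the answer is that the index is full of entries for already-processed or dead rows, is a partial index that only covers the rows this poller cares about, matching the query's predicate and ORDER BY. This is a sketch I have not tested yet (the index name is made up):

```sql
CREATE INDEX CONCURRENTLY idx_poslog_stage_unprocessed_failed
ON poslog_publisher_rms_stage (create_date)
WHERE status IN ('UNPROCESSED', 'FAILED')
  AND error_retry_count < 3;
```

The idea is that such an index should stay tiny if almost all rows end up 'PROCESSED', so the 2-hourly sweep would only touch index pages for the handful of matching rows.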

Thanks
