Re: Query Performance Degradation Due to Partition Scan Order – PostgreSQL v17.6

From: Vivek Gadge <vvkgadge56(at)gmail(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Query Performance Degradation Due to Partition Scan Order – PostgreSQL v17.6
Date: 2025-09-09 13:50:58
Message-ID: CAK+uD7g=AzsedqVo2Q2ZvLs0KR6Xcqj1=_HJxjDLobY7UZ8P_A@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Ashutosh,

Thank you for your feedback regarding this matter.

To provide more context, here is the exact query I am running:

EXPLAIN ANALYZE VERBOSE
SELECT m.txn_date, d.bank_ref
FROM app.main m
JOIN app.detail d ON m.txn_id = d.main_txn_id
WHERE m.txn_id = 9999999999999999999;

Both app.main and app.detail are range partitioned by month (e.g.,
main_202502, detail_202502, etc.) on a date column,

Attached below is the EXPLAIN ANALYZE VERBOSE output for the query. It
shows that PostgreSQL is scanning all partitions in ascending order, even
though the matching record is present in the main_202509 partition.

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------
Nested Loop (cost=1.00..46.19 rows=156 width=346) (actual
time=17.656..18.173 rows=1 loops=1)
Output: m.txn_date, d.bank_ref, m.processor_id, m.txn_code, m.pos_data,
d.issuer_txn_id, m.trans_ref, d.processor_uid, m.sub_txn_type,
d.processor_token
-> Append (cost=0.43..22.05 rows=13 width=238) (actual
time=8.749..9.027 rows=1 loops=1)
-> Index Scan using detail_202502_main_txn_id_key on
app.detail_202502 d_1 (cost=0.43..2.65 rows=1 width=74) (actual
time=1.634..1.634 rows=0 loops=1)
Output: d_1.bank_ref, d_1.issuer_txn_id, d_1.processor_uid,
d_1.processor_token, d_1.main_txn_id
Index Cond: (d_1.main_txn_id = '9999999999999999999'::bigint)
-> Index Scan using detail_202503_main_txn_id_key on
app.detail_202503 d_2 (cost=0.56..2.78 rows=1 width=74) (actual
time=1.158..1.158 rows=0 loops=1)
Output: d_2.bank_ref, d_2.issuer_txn_id, d_2.processor_uid,
d_2.processor_token, d_2.main_txn_id
Index Cond: (d_2.main_txn_id = '9999999999999999999'::bigint)
-> Index Scan using detail_202504_main_txn_id_key on
app.detail_202504 d_3 (cost=0.56..2.78 rows=1 width=74) (actual
time=1.105..1.106 rows=0 loops=1)
Output: d_3.bank_ref, d_3.issuer_txn_id, d_3.processor_uid,
d_3.processor_token, d_3.main_txn_id
Index Cond: (d_3.main_txn_id = '9999999999999999999'::bigint)
-> Index Scan using detail_202505_main_txn_id_key on
app.detail_202505 d_4 (cost=0.56..2.78 rows=1 width=73) (actual
time=1.097..1.097 rows=0 loops=1)
Output: d_4.bank_ref, d_4.issuer_txn_id, d_4.processor_uid,
d_4.processor_token, d_4.main_txn_id
Index Cond: (d_4.main_txn_id = '9999999999999999999'::bigint)
-> Index Scan using detail_202506_main_txn_id_key on
app.detail_202506 d_5 (cost=0.56..2.78 rows=1 width=74) (actual
time=1.002..1.002 rows=0 loops=1)
Output: d_5.bank_ref, d_5.issuer_txn_id, d_5.processor_uid,
d_5.processor_token, d_5.main_txn_id
Index Cond: (d_5.main_txn_id = '9999999999999999999'::bigint)
-> Index Scan using detail_202507_main_txn_id_key on
app.detail_202507 d_6 (cost=0.56..2.78 rows=1 width=74) (actual
time=0.913..0.913 rows=0 loops=1)
Output: d_6.bank_ref, d_6.issuer_txn_id, d_6.processor_uid,
d_6.processor_token, d_6.main_txn_id
Index Cond: (d_6.main_txn_id = '9999999999999999999'::bigint)
-> Index Scan using detail_202508_main_txn_id_key on
app.detail_202508 d_7 (cost=0.56..2.78 rows=1 width=73) (actual
time=0.833..0.833 rows=0 loops=1)
Output: d_7.bank_ref, d_7.issuer_txn_id, d_7.processor_uid,
d_7.processor_token, d_7.main_txn_id
Index Cond: (d_7.main_txn_id = '9999999999999999999'::bigint)
-> Index Scan using detail_202509_main_txn_id_key on
app.detail_202509 d_8 (cost=0.43..2.65 rows=1 width=74) (actual
time=1.001..1.004 rows=1 loops=1)
Output: d_8.bank_ref, d_8.issuer_txn_id, d_8.processor_uid,
d_8.processor_token, d_8.main_txn_id
Index Cond: (d_8.main_txn_id = '9999999999999999999'::bigint)
-> Seq Scan on app.detail_202510 d_9 (cost=0.00..0.00 rows=1
width=500) (actual time=0.041..0.042 rows=0 loops=1)
Output: d_9.bank_ref, d_9.issuer_txn_id, d_9.processor_uid,
d_9.processor_token, d_9.main_txn_id
Filter: (d_9.main_txn_id = '9999999999999999999'::bigint)
-> Seq Scan on app.detail_202511 d_10 (cost=0.00..0.00 rows=1
width=500) (actual time=0.065..0.066 rows=0 loops=1)
Output: d_10.bank_ref, d_10.issuer_txn_id,
d_10.processor_uid, d_10.processor_token, d_10.main_txn_id
Filter: (d_10.main_txn_id = '9999999999999999999'::bigint)
-> Seq Scan on app.detail_202512 d_11 (cost=0.00..0.00 rows=1
width=500) (actual time=0.085..0.085 rows=0 loops=1)
Output: d_11.bank_ref, d_11.issuer_txn_id,
d_11.processor_uid, d_11.processor_token, d_11.main_txn_id
Filter: (d_11.main_txn_id = '9999999999999999999'::bigint)
-> Seq Scan on app.detail_202601 d_12 (cost=0.00..0.00 rows=1
width=500) (actual time=0.039..0.039 rows=0 loops=1)
Output: d_12.bank_ref, d_12.issuer_txn_id,
d_12.processor_uid, d_12.processor_token, d_12.main_txn_id
Filter: (d_12.main_txn_id = '9999999999999999999'::bigint)
-> Seq Scan on app.detail_default d_13 (cost=0.00..0.00 rows=1
width=500) (actual time=0.037..0.037 rows=0 loops=1)
Output: d_13.bank_ref, d_13.issuer_txn_id,
d_13.processor_uid, d_13.processor_token, d_13.main_txn_id
Filter: (d_13.main_txn_id = '9999999999999999999'::bigint)
-> Materialize (cost=0.56..22.23 rows=12 width=125) (actual
time=8.901..9.137 rows=1 loops=1)
Output: m.txn_date, m.processor_id, m.txn_code, m.pos_data,
m.trans_ref, m.sub_txn_type, m.txn_id
-> Append (cost=0.56..22.17 rows=12 width=125) (actual
time=8.892..9.127 rows=1 loops=1)
-> Index Scan using main_202502_pkey on app.main_202502
m_1 (cost=0.56..2.78 rows=1 width=36) (actual time=0.897..0.897 rows=0
loops=1)
Output: m_1.txn_date, m_1.processor_id, m_1.txn_code,
m_1.pos_data, m_1.trans_ref, m_1.sub_txn_type, m_1.txn_id
Index Cond: (m_1.txn_id =
'9999999999999999999'::bigint)
-> Index Scan using main_202503_pkey on app.main_202503
m_2 (cost=0.56..2.78 rows=1 width=37) (actual time=1.105..1.105 rows=0
loops=1)
Output: m_2.txn_date, m_2.processor_id, m_2.txn_code,
m_2.pos_data, m_2.trans_ref, m_2.sub_txn_type, m_2.txn_id
Index Cond: (m_2.txn_id =
'9999999999999999999'::bigint)
-> Index Scan using main_202504_pkey on app.main_202504
m_3 (cost=0.56..2.78 rows=1 width=37) (actual time=1.114..1.114 rows=0
loops=1)
Output: m_3.txn_date, m_3.processor_id, m_3.txn_code,
m_3.pos_data, m_3.trans_ref, m_3.sub_txn_type, m_3.txn_id
Index Cond: (m_3.txn_id =
'9999999999999999999'::bigint)
-> Index Scan using main_202505_pkey on app.main_202505
m_4 (cost=0.56..2.78 rows=1 width=37) (actual time=1.135..1.135 rows=0
loops=1)
Output: m_4.txn_date, m_4.processor_id, m_4.txn_code,
m_4.pos_data, m_4.trans_ref, m_4.sub_txn_type, m_4.txn_id
Index Cond: (m_4.txn_id =
'9999999999999999999'::bigint)
-> Index Scan using main_202506_pkey on app.main_202506
m_5 (cost=0.56..2.78 rows=1 width=38) (actual time=1.096..1.096 rows=0
loops=1)
Output: m_5.txn_date, m_5.processor_id, m_5.txn_code,
m_5.pos_data, m_5.trans_ref, m_5.sub_txn_type, m_5.txn_id
Index Cond: (m_5.txn_id =
'9999999999999999999'::bigint)
-> Index Scan using main_202507_pkey on app.main_202507
m_6 (cost=0.56..2.78 rows=1 width=38) (actual time=1.285..1.285 rows=0
loops=1)
Output: m_6.txn_date, m_6.processor_id, m_6.txn_code,
m_6.pos_data, m_6.trans_ref, m_6.sub_txn_type, m_6.txn_id
Index Cond: (m_6.txn_id =
'9999999999999999999'::bigint)
-> Index Scan using main_202508_pkey on app.main_202508
m_7 (cost=0.56..2.78 rows=1 width=38) (actual time=1.010..1.010 rows=0
loops=1)
Output: m_7.txn_date, m_7.processor_id, m_7.txn_code,
m_7.pos_data, m_7.trans_ref, m_7.sub_txn_type, m_7.txn_id
Index Cond: (m_7.txn_id =
'9999999999999999999'::bigint)
-> Index Scan using main_202509_pkey on app.main_202509
m_8 (cost=0.43..2.65 rows=1 width=37) (actual time=1.243..1.245 rows=1
loops=1)
Output: m_8.txn_date, m_8.processor_id, m_8.txn_code,
m_8.pos_data, m_8.trans_ref, m_8.sub_txn_type, m_8.txn_id
Index Cond: (m_8.txn_id =
'9999999999999999999'::bigint)
-> Seq Scan on app.main_202510 m_9 (cost=0.00..0.00 rows=1
width=300) (actual time=0.057..0.057 rows=0 loops=1)
Output: m_9.txn_date, m_9.processor_id, m_9.txn_code,
m_9.pos_data, m_9.trans_ref, m_9.sub_txn_type, m_9.txn_id
Filter: (m_9.txn_id = '9999999999999999999'::bigint)
-> Seq Scan on app.main_202511 m_10 (cost=0.00..0.00
rows=1 width=300) (actual time=0.045..0.045 rows=0 loops=1)
Output: m_10.txn_date, m_10.processor_id,
m_10.txn_code, m_10.pos_data, m_10.trans_ref, m_10.sub_txn_type, m_10.txn_id
Filter: (m_10.txn_id = '9999999999999999999'::bigint)
-> Seq Scan on app.main_202512 m_11 (cost=0.00..0.00
rows=1 width=300) (actual time=0.076..0.076 rows=0 loops=1)
Output: m_11.txn_date, m_11.processor_id,
m_11.txn_code, m_11.pos_data, m_11.trans_ref, m_11.sub_txn_type, m_11.txn_id
Filter: (m_11.txn_id = '9999999999999999999'::bigint)
-> Seq Scan on app.main_default m_12 (cost=0.00..0.00
rows=1 width=300) (actual time=0.047..0.047 rows=0 loops=1)
Output: m_12.txn_date, m_12.processor_id,
m_12.txn_code, m_12.pos_data, m_12.trans_ref, m_12.sub_txn_type, m_12.txn_id
Filter: (m_12.txn_id = '9999999999999999999'::bigint)

On Tue, 9 Sept, 2025, 8:42 am Ashutosh Bapat, <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
wrote:

> On Mon, Sep 8, 2025 at 5:09 PM Vivek Gadge <vvkgadge56(at)gmail(dot)com> wrote:
> >
> >
> > For example, when a query runs on a partitioned table, PostgreSQL scans
> partitions in the order they were created or attached to the parent table.
> In our case (monthly partitions from January through September), this means
> that queries looking for recent data (e.g., September) may experience
> additional overhead. PostgreSQL evaluates the older partitions first,
> checking their constraints and in some cases probing their indexes, before
> reaching the later partitions that actually contain the needed data.
> >
> > As a result, while the query results are correct, the execution time
> increases due to unnecessary work on irrelevant partitions. This
> performance impact is more noticeable when the target partition is at the
> end of the scan order and pruning cannot fully eliminate the earlier
> partitions.
> >
>
> If you don't want data from certain partitions maybe you should add a
> clause that will help partition pruning. If you need data from all
> partitions, the order in which they are scanned doesn't matter, those
> will be scanned either way.
>
> If partitioning pruning isn't working for you, please report the exact
> query. Please provide example queries anyway.
>
> --
> Best Wishes,
> Ashutosh Bapat
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2025-09-09 14:00:04 Re: eliminate xl_heap_visible to reduce WAL (and eventually set VM on-access)
Previous Message Melanie Plageman 2025-09-09 13:39:19 Re: Checkpointer write combining