| From: | yudhi s <learnerdatabase99(at)gmail(dot)com> |
|---|---|
| To: | Ron Johnson <ronljohnsonjr(at)gmail(dot)com> |
| Cc: | "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Top -N Query performance issue and high CPU usage |
| Date: | 2026-02-03 09:26:14 |
| Message-ID: | CAEzWdqc3hdnZbHTLd1kXxR3FUdxGpM7=Eea9c8Hp7QsF+HKMtg@mail.gmail.com |
| Lists: | pgsql-general |
On Tue, Feb 3, 2026 at 4:50 AM Ron Johnson <ronljohnsonjr(at)gmail(dot)com> wrote:
> On Mon, Feb 2, 2026 at 3:43 PM yudhi s <learnerdatabase99(at)gmail(dot)com>
> wrote:
>
>>
>> On Tue, Feb 3, 2026 at 1:01 AM Ron Johnson <ronljohnsonjr(at)gmail(dot)com>
>> wrote:
>>
>>> On Mon, Feb 2, 2026 at 1:39 PM yudhi s <learnerdatabase99(at)gmail(dot)com>
>>> wrote:
>>>
>>>> On Mon, Feb 2, 2026 at 8:57 PM Ron Johnson <ronljohnsonjr(at)gmail(dot)com>
>>>> wrote:
>>>>
>>>>>
>>>>>> My apologies if I misunderstand the plan, but as I read it, it's
>>>>>> spending ~140ms (147ms - 6ms), i.e. almost all the time now, performing the
>>>>>> below nested loop join. So my question was: is there any possibility to
>>>>>> reduce the resource consumption or response time further here? Hope my
>>>>>> understanding is correct.
>>>>>>
>>>>>> -> Nested Loop (cost=266.53..1548099.38 rows=411215 width=20) (actual
>>>>>> time=*6.009..147.695* rows=1049 loops=1)
>>>>>> Join Filter: ((df.ent_id)::numeric = m.ent_id)
>>>>>> Rows Removed by Join Filter: 513436
>>>>>> Buffers: shared hit=1939
>>>>>>
>>>>>
>>>>> I don't see m.ent_id in the actual query. Did you only paste a
>>>>> portion of the query?
>>>>>
>>>>> Also, casting in a JOIN typically brutalizes the ability to use an
>>>>> index.
>>>>>
>>>>>
>>>>> Thank you.
Actually I tried executing the first two CTEs, where the query was
>>>> spending most of the time, and the alias has changed.
>>>>
>>>
>>> We need to see everything, not just what you think is relevant.
>>>
>>>
Also, here I have changed the real table names before posting them,
>>>> hope that is fine.
>>>> However, I verified the data type of the ent_id column: in "ent" it is
>>>> "int8" and in table "txn_tbl" it is "numeric(12)". So do you mean this
>>>> difference in data type is causing the high response time during the
>>>> nested loop join? My understanding was that it would be internally castable
>>>> without additional burden. Also, even when I tried creating an index on "(df.ent_id)::numeric",
>>>> it still results in the same plan and response time.
>>>>
>>>
>>> If you'd shown the "\d" table definitions like Adrian asked two days
>>> ago, we'd know what indexes are on each table, and not have to beg you to
>>> dispense dribs and drabs of information.
>>>
>>>
I am unable to run "\d" from the DBeaver SQL worksheet. However, I have
>> fetched the DDL for the three tables and their selected columns, used in
>> the smaller version of the query and its plan, which I recently updated.
>>
>> https://gist.github.com/databasetech0073/e4290b085f8f974e315fb41bdc47a1f3
>>
>> https://gist.github.com/databasetech0073/344df46c328e02b98961fab0cd221492
>>
>
> Lines 30-32 are where most of the time and effort are taken.
>
> I can't be certain, but changing APP_schema.ent.ent_id from NUMERIC to
> int8 (with a CHECK constraint to, well, constrain it to 12 digits, if
> really necessary) is something I'd test.
>
> --
>
Thank you so much.
After making the data types equal on both tables for the ent_id column, the
plan now looks as below. The casting function is now removed, so it must be
helping reduce CPU cycle consumption to some extent. But I still see
~100ms spent in this step. Is there anything we can do to further drop the
response time here, or is this the best we can get?
-> Nested Loop (cost=262.77..1342550.91 rows=579149 width=20) (*actual
time=6.406..107.946* rows=1049 loops=1)
Join Filter: (*df.ent_id = m.ent_id*)
Rows Removed by Join Filter: 514648
Buffers: shared hit=1972
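For reference, the type change that removed the cast can be sketched roughly as below (table and constraint names are assumed, not taken from the real schema; note that changing a column's type rewrites the table and takes a heavy lock, so it should be tested on a copy first):

```sql
-- Hypothetical sketch: convert the numeric(12) ent_id to int8 so both join
-- keys share one type; a CHECK constraint preserves the 12-digit limit.
ALTER TABLE app_schema.txn_tbl
    ALTER COLUMN ent_id TYPE int8 USING ent_id::int8,
    ADD CONSTRAINT txn_tbl_ent_id_12_digits
        CHECK (ent_id BETWEEN -999999999999 AND 999999999999);
ANALYZE app_schema.txn_tbl;
```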
Also, I do see the casting function being used in some other steps of the
plan, for example in the filter below. Here txn_tbl_type_nm is defined as
varchar(25), yet it is still being cast to text. Can we do anything to
avoid these forced casts, as they must consume CPU cycles?
AND txn_tbl_dcsn.txn_tbl_txn_sts_tx NOT IN ('STATUS_A','STATUS_B')
WHERE txn_tbl.txn_tbl_type_nm IN ('TYPE1','TYPE2','TYPE3')
-> Index Scan Backward using txn_tbl_due_dt_idx on txn_tbl df
(cost=0.43..115879.87 rows=1419195 width=20) (actual time=0.019..20.377
rows=43727 loops=1)
Filter: *((txn_tbl_type_nm)::text = ANY ('{TYPE1,TYPE2,TYPE3}'::text[])*)
Rows Removed by Filter: 17
Buffers: shared hit=1839
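For what it's worth, whether a cast like (txn_tbl_type_nm)::text does any real work at runtime can be checked in the pg_cast catalog; this is just a diagnostic sketch:

```sql
-- Look up how PostgreSQL implements the varchar -> text cast.
-- castmethod 'b' means binary-coercible: no function is called at runtime,
-- so a cast of this kind shown in a plan should cost essentially nothing.
SELECT castsource::regtype AS source,
       casttarget::regtype AS target,
       castmethod
FROM pg_cast
WHERE castsource = 'varchar'::regtype
  AND casttarget = 'text'::regtype;
```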
The plan is as below.
https://gist.github.com/databasetech0073/558377c1939a9291e7b72b1cbac7c9f9
Regards
Yudhi