From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Add memory_limit_hits to pg_stat_replication_slots |
Date: | 2025-09-25 10:17:51 |
Message-ID: | aNUWz4wbJsgxkA8T@ip-10-97-1-34.eu-west-3.compute.internal |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Wed, Sep 24, 2025 at 10:11:20AM -0700, Masahiko Sawada wrote:
> On Tue, Sep 23, 2025 at 11:31 PM Bertrand Drouvot
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> Thank you for updating the patch! Here are some comments:
>
> ---
> + bool memory_limit_reached = (rb->size >=
> logical_decoding_work_mem * (Size) 1024);
> +
> + if (memory_limit_reached)
> + rb->memExceededCount += 1;
>
> Do we want to use 'exceeded' for the variable too for better consistency?
I thought about it, but since we use ">=" I think that "reached" is more
accurate. So I went for "reached" for this one and "exceeded" for "user facing"
ones. That said I don't have a strong opinion about it, and I'd be ok to use
"exceeded" if you feel strong about it.
> ---
> One thing I want to clarify is that even if the memory usage exceeds
> the logical_decoding_work_mem it doesn't necessarily mean we serialize
> or stream transactions because of
> ReorderBufferCheckAndTruncateAbortedTXN().
Right.
> For example, in a situation
> where many large already-aborted transactions are truncated by
> transactionsReorderBufferCheckAndTruncateAbortedTXN(), users would see
> a high number in mem_exceeded_count column but it might not actually
> require any adjustment for logical_decoding_work_mem.
Yes, but in that case mem_exceeded_count would be high compared to spill_txns,
stream_txns, no?
> One idea is to
> increment that counter if exceeding memory usage is caused to
> serialize or stream any transactions. On the other hand, it might make
> sense and be straightforward too to show a pure statistic that the
> memory usage exceeded the logical_decoding_work_mem. What do you
> think?
The new counter, as it is proposed, helps to see if the workload hits the
logical_decoding_work_mem frequently or not. I think it's valuable information
to have on its own.
Now to check if logical_decoding_work_mem needs adjustment, one could compare
mem_exceeded_count with the existing spill_txns and stream_txns.
For example, If I abort 20 transactions that exceeded logical_decoding_work_mem
, I'd get:
postgres=# select spill_txns,stream_txns,mem_exceeded_count from pg_stat_replication_slots ;
spill_txns | stream_txns | mem_exceeded_count
------------+-------------+--------------------
0 | 0 | 20
(1 row)
That way I could figure out that mem_exceeded_count has been reached for
aborted transactions.
OTOH, If one see spill_txns + stream_txns close to mem_exceeded_count, like:
postgres=# select spill_txns,stream_txns,mem_exceeded_count from pg_stat_replication_slots ;
spill_txns | stream_txns | mem_exceeded_count
------------+-------------+--------------------
38 | 20 | 58
(1 row)
That probably means that mem_exceeded_count would need to be increased.
What do you think?
BTW, while doing some tests for the above examples, I noticed that the patch
was missing a check on memExceededCount in UpdateDecodingStats() (that produced
mem_exceeded_count being 0 for one of the new test in test_decoding): Fixed in
v5 attached.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment | Content-Type | Size |
---|---|---|
v5-0001-Add-mem_exceeded_count-to-pg_stat_replication_slo.patch | text/x-diff | 22.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Jim Jones | 2025-09-25 10:45:20 | Re: Fix bug with accessing to temporary tables of other sessions |
Previous Message | Chao Li | 2025-09-25 10:09:44 | Re: Fix incorrect function comment of stringToNodeInternal |