Re: Add memory_limit_hits to pg_stat_replication_slots

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Add memory_limit_hits to pg_stat_replication_slots
Date: 2025-09-25 10:17:51
Message-ID: aNUWz4wbJsgxkA8T@ip-10-97-1-34.eu-west-3.compute.internal
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Wed, Sep 24, 2025 at 10:11:20AM -0700, Masahiko Sawada wrote:
> On Tue, Sep 23, 2025 at 11:31 PM Bertrand Drouvot
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> Thank you for updating the patch! Here are some comments:
>
> ---
> + bool memory_limit_reached = (rb->size >=
> logical_decoding_work_mem * (Size) 1024);
> +
> + if (memory_limit_reached)
> + rb->memExceededCount += 1;
>
> Do we want to use 'exceeded' for the variable too for better consistency?

I thought about it, but since we use ">=" I think that "reached" is more
accurate. So I went for "reached" for this one and "exceeded" for "user facing"
ones. That said I don't have a strong opinion about it, and I'd be ok to use
"exceeded" if you feel strong about it.

> ---
> One thing I want to clarify is that even if the memory usage exceeds
> the logical_decoding_work_mem it doesn't necessarily mean we serialize
> or stream transactions because of
> ReorderBufferCheckAndTruncateAbortedTXN().

Right.

> For example, in a situation
> where many large already-aborted transactions are truncated by
> transactionsReorderBufferCheckAndTruncateAbortedTXN(), users would see
> a high number in mem_exceeded_count column but it might not actually
> require any adjustment for logical_decoding_work_mem.

Yes, but in that case mem_exceeded_count would be high compared to spill_txns,
stream_txns, no?

> One idea is to
> increment that counter if exceeding memory usage is caused to
> serialize or stream any transactions. On the other hand, it might make
> sense and be straightforward too to show a pure statistic that the
> memory usage exceeded the logical_decoding_work_mem. What do you
> think?

The new counter, as it is proposed, helps to see if the workload hits the
logical_decoding_work_mem frequently or not. I think it's valuable information
to have on its own.

Now to check if logical_decoding_work_mem needs adjustment, one could compare
mem_exceeded_count with the existing spill_txns and stream_txns.

For example, If I abort 20 transactions that exceeded logical_decoding_work_mem
, I'd get:

postgres=# select spill_txns,stream_txns,mem_exceeded_count from pg_stat_replication_slots ;
spill_txns | stream_txns | mem_exceeded_count
------------+-------------+--------------------
0 | 0 | 20
(1 row)

That way I could figure out that mem_exceeded_count has been reached for
aborted transactions.

OTOH, If one see spill_txns + stream_txns close to mem_exceeded_count, like:

postgres=# select spill_txns,stream_txns,mem_exceeded_count from pg_stat_replication_slots ;
spill_txns | stream_txns | mem_exceeded_count
------------+-------------+--------------------
38 | 20 | 58
(1 row)

That probably means that mem_exceeded_count would need to be increased.

What do you think?

BTW, while doing some tests for the above examples, I noticed that the patch
was missing a check on memExceededCount in UpdateDecodingStats() (that produced
mem_exceeded_count being 0 for one of the new test in test_decoding): Fixed in
v5 attached.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v5-0001-Add-mem_exceeded_count-to-pg_stat_replication_slo.patch text/x-diff 22.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jim Jones 2025-09-25 10:45:20 Re: Fix bug with accessing to temporary tables of other sessions
Previous Message Chao Li 2025-09-25 10:09:44 Re: Fix incorrect function comment of stringToNodeInternal