Logical Replication Memory Allocation Error - "invalid memory alloc request size"

From: Max Madden <maxmmadden(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Logical Replication Memory Allocation Error - "invalid memory alloc request size"
Date: 2025-06-10 15:37:03
Message-ID: CAD1FGCT2sYrP_70RTuo56QTizyc+J3wJdtn2gtO3VttQFpdMZg@mail.gmail.com
Lists: pgsql-general

Hello,

I'm encountering a consistent issue with PostgreSQL 15 logical replication
and would appreciate any guidance on debugging or resolving this problem.

*Setup:*
- Source: PostgreSQL 15.x
- Target: PostgreSQL 15.x
- Replication: Logical replication using publication/subscription (pgoutput)
- Tables: 3 tables (details below)

*Table Details:*
- Table 1: ~1,300 records, 7 columns, no large objects
- Table 2: ~100,000 records, 7 columns, no large objects
- Table 3: ~100,000 records, 17 columns, no large objects

*Problem:*

The initial snapshot and data copy complete successfully for all tables.
However, anywhere from 5 minutes to 2 hours after the initial sync, the
subscription consistently fails with memory allocation errors like:

```
2025-06-10 14:14:56.800 UTC [299] ERROR: could not receive data from WAL stream: ERROR: invalid memory alloc request size 1238451248
2025-06-10 14:14:56.805 UTC [1] LOG: background worker "logical replication worker" (PID 299) exited with exit code 1
```

This occurs whether I replicate all 3 tables together or individually.
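
To put the size in that error in context, here is a small Python sketch that pulls the requested size out of the log line and compares it with PostgreSQL's cap on a single ordinary memory allocation. It assumes the limit being hit is `MaxAllocSize` (1 GB minus one byte); `requested_alloc_size` and `ALLOC_ERROR` are hypothetical names used only for illustration.

```python
import re

# PostgreSQL's cap on a single ordinary palloc() request (MaxAllocSize):
# 1 GB minus one byte. Assumed here to be the limit behind the error.
MAX_ALLOC_SIZE = 0x3FFFFFFF

# Hypothetical helper pattern; it matches the error text shown in the log.
ALLOC_ERROR = re.compile(r"invalid memory alloc request size (\d+)")


def requested_alloc_size(log_text):
    """Return the requested size in bytes from a log excerpt, or None."""
    match = ALLOC_ERROR.search(log_text)
    return int(match.group(1)) if match else None


log_excerpt = (
    "2025-06-10 14:14:56.800 UTC [299] ERROR: could not receive data from WAL "
    "stream: ERROR: invalid memory alloc request size 1238451248"
)

size = requested_alloc_size(log_excerpt)
print(f"requested: {size} bytes ({size / 2**30:.2f} GiB)")
print(f"over the single-allocation cap by {size - MAX_ALLOC_SIZE} bytes")
```

If that assumption holds, the request is rejected because of its size alone, regardless of how much RAM the machine has.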

My initial hypothesis is that large transactions are creating WAL segments
that exceed memory limits when sent to the subscriber. However, I haven't
been able to confirm this or find the actual cause.

*Questions:*
1. What's the best approach to debug this memory allocation issue?
2. Are there specific PostgreSQL settings I should check?
3. How can I identify if large transactions are indeed the root cause?
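
One possible starting point for question 3, sketched in Python: on PostgreSQL 14 and later the publisher exposes per-slot decoding counters in `pg_stat_replication_slots`, including how many transactions were spilled to disk for exceeding `logical_decoding_work_mem`. The query text below uses that view's standard columns. The helper `summarize_slot` and the sample row are hypothetical (the numbers are invented, not taken from my system), and I am assuming the view is readable in the managed environment.

```python
# Query to run on the publisher with any SQL client (PostgreSQL 14+).
SLOT_STATS_SQL = """
SELECT slot_name, total_txns, total_bytes, spill_txns, spill_bytes
FROM pg_stat_replication_slots
"""


def summarize_slot(row):
    """Summarize one result row (as a dict) from SLOT_STATS_SQL.

    Hypothetical helper: reports the average decoded bytes per transaction
    and whether any transaction was large enough to spill to disk.
    """
    total_txns = row["total_txns"]
    avg_bytes = row["total_bytes"] / total_txns if total_txns else 0.0
    return {
        "slot_name": row["slot_name"],
        "avg_bytes_per_txn": avg_bytes,
        "had_spilled_txns": row["spill_txns"] > 0,
    }


# Invented sample row, only to show the shape of the input and output.
sample_row = {
    "slot_name": "my_subscription",
    "total_txns": 200,
    "total_bytes": 1_000_000,
    "spill_txns": 0,
    "spill_bytes": 0,
}

print(summarize_slot(sample_row))
```

A slot with zero spilled transactions and a small average would argue against the large-transaction hypothesis. A nonzero `spill_txns` would not confirm it, but would show that large transactions are at least present.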

*Additional Context:*
- This happens consistently across multiple replication attempts
- The requested allocation size varies but is always greater than 1 GB
- No custom logical replication settings currently applied
- Subscriber machine has 256 GB of RAM and runs Ubuntu 20.04
- Can recreate it on different machines

I should also mention that we're operating in a managed environment on
DigitalOcean, which means we don't have direct access to the WAL logs on
the publisher node. This is why the log information above is limited. I
understand this constraint makes it more difficult to provide help, but I
would really appreciate any insights or suggestions you might have.

Thanks,

Max
