| From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
|---|---|
| To: | shawn wang <shawn(dot)wang(dot)pg(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Add logical_decoding_spill_limit to cap spill file disk usage per slot |
| Date: | 2026-06-25 18:41:49 |
| Message-ID: | CALj2ACX6txtXhvboJAU4TtFeodqKD4qrEM2NPiQh0N-Zkg8YUA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
On Fri, Apr 3, 2026 at 9:14 AM shawn wang <shawn(dot)wang(dot)pg(at)gmail(dot)com> wrote:
>
> > Having a lot of spill files also increases crash/recovery times.
> > However, files spilling to disk causing no-space-left-on-disk issues
> > leading to downtime applies to WAL files, historical catalog snapshot
> > files, subtransaction overflow files, CLOG (and all the subsystems
> > backed by SLRU data structure), etc. - basically any Postgres
> > subsystem writing files to disk. I'm a bit worried that we may end up
> > solving disk space issues, which IMHO are outside of the database
> > scope, in the database. Others may have different opinions though.
>
> That's a fair philosophical point. However, I'd argue that spill files
> are uniquely unprotected compared to the other subsystems you mentioned:
>
> - WAL files → max_slot_wal_keep_size
> - Temp files → temp_file_limit
> - SLRU/CLOG → bounded by design (sized proportionally to active XIDs)
>
> Spill files are the gap: they can grow proportionally to the *data
> volume* of in-progress transactions, not to the number of transactions,
> and there is no existing upper bound. A single bulk-load transaction
> replicated via logical decoding can generate tens of GBs of spill files
> with no way for the DBA to cap it.
Fair argument.
I will also wait for others to provide thoughts on this proposal.
> > How common is this issue? Could you please add a test case to the
> > proposed patch that without this feature would otherwise hit the issue
> > described?
>
> Typically triggered by bulk data migrations or long-running DDL on large tables,
> combined with a subscriber that fell behind. Each incident caused the instance
> to go read-only, affecting all workloads — not just replication.
I'm trying to understand the use-case better here. How large is the
transaction doing the bulk data load or long-running DDL (in terms of
amount of data being loaded, and for DDL - whether it's a table
rewrite, the sizes, and the time taken - approximate numbers are
sufficient)? What was the logical_decoding_work_mem setting when you
encountered the issue? Also, was streaming=parallel or streaming=on
enabled, and if not, why was it not possible to enable streaming if
using native logical replication, increase logical_decoding_work_mem,
or increase the disk space? This sets good context for the proposal.
> cenario (b) is strictly better than (a): it converts an
> instance-wide outage into a replication-only interruption with a
> clear, actionable error message.
>
> Additionally, in the updated patch (v2), I have ensured that spill
>
> files are properly cleaned up in the error path (via
> WalSndErrorCleanup), so exceeding the limit does not leave orphaned
> files on disk.
1/
+ ereport(ERROR,
+ (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
+ errmsg("logical decoding spill file size limit exceeded"),
+ errdetail("Current on-disk spill size is %zu bytes, "
+ "transaction to spill is %zu bytes, "
+ "limit is %d kB.",
+ rb->spillBytesOnDisk, txn->size,
+ logical_decoding_spill_limit),
+ errhint("Consider increasing %s, %s, "
+ "or using a streaming-capable output plugin.",
+ "logical_decoding_spill_limit",
+ "logical_decoding_work_mem")));
I don't think just specifying "streaming-capable output plugin" is a
good idea here - what if they are already using streaming but still
hitting this ERROR? The most important thing for the user here is to
understand what actually caused the issue and fix the root cause,
rather than just increasing the spill limit and hitting the same issue
again later.
Also, it's better to include all the possible reasons (like the XID
wraparound warnings do) that would lead to this situation - such as a
large transaction (e.g., bulk data load or a long-running DDL)
combined with a slow or unresponsive subscriber (for logical
replication) or logical decoding consumer. Also, give a hint about how
one can check the total spill file size using pg_ls_replslotdir or
similar - this gives better reasoning for the user to act on.
2/
+ rb->spillBytesOnDisk + txn->size >
What happens after a restart for the spilled bytes? How does it
account for the on-disk size after a restart?
3/ There seems to be a design issue here. Say the disk has only 20GB
free, and I set logical_decoding_spill_limit = 5GB thinking that
limits my spill usage. With 5 replication slots (5 walsenders), each
slot can spill up to 5GB independently - so total spill can reach
25GB, which exceeds the 20GB available and fills the disk. The limit
fails at exactly the scenario it's meant to prevent. To stay under
20GB I'd have to hand-compute the per-slot value (20GB / 5 = 4GB), but
that breaks the moment a slot is added or dropped. A per-slot limit
doesn't limit the global resource (disk space) it's supposed to
protect.
Thoughts?
--
Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Bharath Rupireddy | 2026-06-25 19:30:09 | Re: Optimize UUID parse using SIMD |
| Previous Message | Sami Imseih | 2026-06-25 18:35:45 | Re: pg_stat_statements: Remove (errcode...) framing parentheses in erport(...) |