From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Dmitriy Kuzmin <kuzmin(dot)db4(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Startup process on a hot standby crashes with an error "invalid memory alloc request size 1073741824" while replaying "Standby/LOCK" records
[ redirecting to -hackers because patch attached ]
David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> So that confirms there were 950k relations in the xl_standby_locks.
> The contents of that message seem to be produced by standby_desc().
> That should be the same WAL record that's processed by standby_redo()
> which adds the 950k locks to the RecoveryLockListsEntry.
> I'm not seeing why 950k becomes 134m.
I figured out what the problem is. The standby's startup process
retains knowledge of all these locks in standby.c's RecoveryLockLists
data structure, which *has no de-duplication capability*. It'll add
another entry to the per-XID list any time it's told about a given
exclusive lock. And checkpoints cause us to regurgitate the entire
set of currently-held exclusive locks into the WAL. So if you have
a process holding a lot of exclusive locks, and sitting on them
across multiple checkpoints, standby startup processes will bloat.
It's not a true leak, in that we know where the memory is and
we'll release it whenever we see that XID commit/abort. And I doubt
that this is a common usage pattern, which probably explains the
lack of previous complaints. Still, bloat bad.
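To make the growth pattern concrete, here is a minimal standalone sketch, not the code in standby.c and not the attached patch. It replays the same set of locks once per checkpoint into (a) an append-only list with no de-duplication and (b) a small de-duplicating hash set. All names (LockKey, LockList, LockSet, replay_snapshots) and the fixed-capacity open-addressing scheme are assumptions made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical lock identity: (database OID, relation OID). */
typedef struct LockKey { uint32_t dbOid; uint32_t relOid; } LockKey;

/* (a) Append-only list: every reported lock is stored, duplicates included. */
typedef struct LockList { LockKey *items; size_t len; size_t cap; } LockList;

static void list_add(LockList *l, LockKey k)
{
    if (l->len == l->cap)
    {
        size_t   ncap = l->cap ? l->cap * 2 : 16;   /* double on overflow */
        LockKey *p = realloc(l->items, ncap * sizeof(LockKey));

        assert(p != NULL);
        l->items = p;
        l->cap = ncap;
    }
    l->items[l->len++] = k;
}

/* (b) De-duplicating set: open addressing with linear probing. */
typedef struct Slot { LockKey key; bool used; } Slot;
typedef struct LockSet { Slot *slots; size_t cap; size_t len; } LockSet;

/* cap must be a power of two and larger than the number of distinct keys. */
static void set_init(LockSet *s, size_t cap)
{
    s->slots = calloc(cap, sizeof(Slot));
    assert(s->slots != NULL);
    s->cap = cap;
    s->len = 0;
}

/* Returns true if the key was newly inserted, false if already present. */
static bool set_add(LockSet *s, LockKey k)
{
    uint32_t h = (k.dbOid * 2654435761u) ^ (k.relOid * 40503u);
    size_t   i = h & (s->cap - 1);

    while (s->slots[i].used)
    {
        if (s->slots[i].key.dbOid == k.dbOid &&
            s->slots[i].key.relOid == k.relOid)
            return false;               /* duplicate: nothing stored */
        i = (i + 1) & (s->cap - 1);
    }
    assert(s->len + 1 < s->cap);        /* sketch only: no resizing */
    s->slots[i].key = k;
    s->slots[i].used = true;
    s->len++;
    return true;
}

/* Each checkpoint re-reports the same nlocks locks held by one XID. */
static void replay_snapshots(LockList *l, LockSet *s,
                             uint32_t nlocks, int checkpoints)
{
    for (int c = 0; c < checkpoints; c++)
        for (uint32_t rel = 0; rel < nlocks; rel++)
        {
            LockKey k = { 1, 16384 + rel };

            list_add(l, k);
            set_add(s, k);
        }
}
```

Calling replay_snapshots() with a zero-initialized LockList, a LockSet from set_init(), 1000 locks and 5 checkpoints leaves the list at 5000 entries while the set stays at 1000. The list grows with locks times checkpoints; the set is bounded by the number of distinct locks.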
PFA a quick-hack fix that solves this issue by making per-transaction
subsidiary hash tables. That's overkill perhaps; I'm a little worried
about whether this slows down normal cases more than it's worth.
But we ought to do something about this, because aside from the
duplication aspect the current storage of these lists seems mighty
regards, tom lane