Decoding speculative insert with toast leaks memory

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Decoding speculative insert with toast leaks memory
Date: 2021-03-25 05:33:58
Message-ID: CAExHW5sPKF-Oovx_qZe4p5oM6Dvof7_P+XgsNAViug15Fm99jA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi All,
We saw OOM in a system where WAL sender consumed Gigabttes of memory
which was never released. Upon investigation, we found out that there
were many ReorderBufferToastHash memory contexts linked to
ReorderBuffer context, together consuming gigs of memory. They were
running INSERT ... ON CONFLICT .. among other things. A similar report
at [1]

I could reproduce a memory leak in wal sender using following steps
Session 1
postgres=# create table t_toast (a int primary key, b text);
postgres=# CREATE PUBLICATION dbz_minimal_publication FOR TABLE public.t_toast;

Terminal 4
$ pg_recvlogical -d postgres --slot pgoutput_minimal_test_slot
--create-slot -P pgoutput
$ pg_recvlogical -d postgres --slot pgoutput_minimal_test_slot --start
-o proto_version=1 -o publication_names='dbz_minimal_publication' -f
/dev/null

Session 1
postgres=# select * from pg_replication_slots ;
slot_name | plugin | slot_type | datoid | database
| temporary | active | active_pid | xmin | catalog_xmin | restart_lsn
| confirmed_flush_lsn
----------------------------+----------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
pgoutput_minimal_test_slot | pgoutput | logical | 12402 | postgres
| f | f | | | 570 | 0/15FFFD0
| 0/1600020

postgres=# begin;
postgres=# insert into t_toast values (500, repeat('something' ||
txid_current()::text, 100000)) ON CONFLICT (a) DO NOTHING;
INSERT 0 1

Session 2 (xid = 571)
postgres=# begin;
postgres=# insert into t_toast values (500, repeat('something' ||
txid_current()::text, 100000)) ON CONFLICT (a) DO NOTHING;

Session 3 (xid = 572)
postgres=# begin;
postgres=# insert into t_toast values (500, repeat('something' ||
txid_current()::text, 100000)) ON CONFLICT (a) DO NOTHING;

Session 1 (this session doesn't modify the table but is essential for
speculative insert to happen.)
postgres=# rollback;

Session 2 and 3 in the order in which you get control back (in my case
session 2 with xid = 571)
INSERT 0 1
postgres=# commit;
COMMIT

other session (in my case session 3 with xid = 572)
INSERT 0 0
postgres=# commit;
COMMIT

With the attached patch, we see following in the server logs
2021-03-25 09:57:20.469 IST [12424] LOG: starting logical decoding
for slot "pgoutput_minimal_test_slot"
2021-03-25 09:57:20.469 IST [12424] DETAIL: Streaming transactions
committing after 0/1600020, reading WAL from 0/15FFFD0.
2021-03-25 09:57:20.469 IST [12424] LOG: logical decoding found
consistent point at 0/15FFFD0
2021-03-25 09:57:20.469 IST [12424] DETAIL: There are no running transactions.
2021-03-25 09:59:45.494 IST [12424] LOG: initializing hash table for
transaction 571
2021-03-25 09:59:45.494 IST [12424] LOG: speculative insert
encountered in transaction 571
2021-03-25 09:59:45.494 IST [12424] LOG: speculative insert confirmed
in transaction 571
2021-03-25 09:59:45.504 IST [12424] LOG: destroying toast hash for
transaction 571
2021-03-25 09:59:50.806 IST [12424] LOG: initializing hash table for
transaction 572
2021-03-25 09:59:50.806 IST [12424] LOG: speculative insert
encountered in transaction 572
2021-03-25 09:59:50.806 IST [12424] LOG: toast hash for transaction
572 is not cleared

Observe that the toast_hash was cleaned for the transaction 571 which
successfully inserted the row but was not cleaned for the transaction
572 which performed DO NOTHING instead of INSERT.

Here's the sequence of events which leads to memory leak in wal sender
1. Transaction 571 performs a speculative INSERT which is decoded as
toast insert followed by speculative insert of row
2. decoding toast tuple, causes the toast hash to be created
3. Speculative insert is ignored while decoding
4. Speculative insert is confimed and decoded as a normal INSERT, also
destroying the toast hash
5. Transaction 572 performs speculative insert which is decoded as
toast insert followed by speculative insert of row
6. decoding toast tuple causes the toast hash to be created
7. speculative insert is ignored while decoding
... Speculative INSERT is never confirmed and thus toast hash is never
destroyed leaking memory

In memory context dump we see as many ReorderBufferToastHash as the
number of times the above sequence is repeated.
TopMemoryContext: 1279640 total in 7 blocks; 23304 free (17 chunks);
1256336 used
...
Replication command context: 32768 total in 3 blocks; 10952 free
(9 chunks); 21816 used
...
ReorderBuffer: 8192 total in 1 blocks; 7656 free (7 chunks); 536 used
ReorderBufferToastHash: 8192 total in 1 blocks; 2056 free (0
chunks); 6136 used
ReorderBufferToastHash: 8192 total in 1 blocks; 2056 free (0
chunks); 6136 used
ReorderBufferToastHash: 8192 total in 1 blocks; 2056 free (0
chunks); 6136 used

The relevant code is all in ReoderBufferCommit() in cases
REORDER_BUFFER_CHANGE_INSERT,
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_INSERT and
REORDER_BUFFER_CHANGE_INTERNAL_SPEC_CONFIRM.

About the solution: The speculative insert needs to be ignored since
it can be rolled back later. If speculative insert is not confirmed,
there is no way to know that the speculative insert change required a
toast_hash table and destroy it before the next change starts.
ReorderBufferCommit seems to notice a speculative insert that was
never confirmed in the following code
1624 change_done:
1625
1626 /*
1627 * Either speculative insertion was
confirmed, or it was
1628 * unsuccessful and the record isn't needed anymore.
1629 */
1630 if (specinsert != NULL)
1631 {
1632 ReorderBufferReturnChange(rb, specinsert);
1633 specinsert = NULL;
1634 }
1635
1636 if (relation != NULL)
1637 {
1638 RelationClose(relation);
1639 relation = NULL;
1640 }
1641 break;

but by then we might have reused the toast_hash and thus can not be
destroyed. But that isn't the problem since the reused toast_hash will
be destroyed eventually.

It's only when the change next to speculative insert is something
other than INSERT/UPDATE/DELETE that we have to worry about a
speculative insert that was never confirmed. So may be for those
cases, we check whether specinsert != null and destroy toast_hash if
it exists.

[1] https://www.postgresql-archive.org/Diagnose-memory-leak-in-logical-replication-td6161318.html

--
Best Wishes,
Ashutosh Bapat

Attachment Content-Type Size
rb_toas_hash_mem_leak.patch text/x-patch 11.9 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2021-03-25 05:41:59 Re: Proposal: Save user's original authenticated identity for logging
Previous Message Kyotaro Horiguchi 2021-03-25 05:15:03 Re: In-placre persistance change of a relation