Re: BUG #18283: vacuum full use a large amount of memory (may cause OOM)

From: Yang Zhu <fairyfar(at)msn(dot)com>
To: Yang Zhu <fairyfar(at)msn(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #18283: vacuum full use a large amount of memory (may cause OOM)
Date: 2024-01-17 06:07:54
Message-ID: SY5P282MB47277A4E5D1CF22B263FF240D5722@SY5P282MB4727.AUSP282.PROD.OUTLOOK.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The test case in the original report was flawed and I have updated it.

Step 1. Create a new session("Sess 1"), then execute the following query and keep the connection:

```sql
-- create table and insert data for testing.
create table t1 (
c1 character varying(100),
c2 character varying(100),
c3 character varying(100),
c4 character varying(100),
c5 character varying(100),
c6 character varying(100)
);

create function randomtext(len int) returns text as $$
select string_agg(md5(random()::text),'') from generate_series(1,$1/32)
$$ language sql;

insert into t1 select
randomtext(34),randomtext(34),randomtext(34),randomtext(34),randomtext(34),'RST'
from generate_series(1,1000000);
```

Step 2. Create another session("Sess 2"). Start a long transaction, then create any table, and keep the connection:

```sql
BEGIN;
create table t2(a int);
```

Step 3. Go back to "Sess 1" and continue:

```sql
-- Get the PID of the current backend
postgres=# select pg_backend_pid();
pg_backend_pid
----------------
10511
(1 row)

-- Update 5 times
update t1 set c2=randomtext(34) where c6='RST';
update t1 set c2=randomtext(34) where c6='RST';
update t1 set c2=randomtext(34) where c6='RST';
update t1 set c2=randomtext(34) where c6='RST';
update t1 set c2=randomtext(34) where c6='RST';

vacuum full t1;
```

Step 4. Create a new bash terminal and view the memory usage of PID 10511
during the execution of 'vacumm full':

```bash
[yz(at)bogon ~]$ top -p 10511

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
10511 yz 20 0 1853340 1.6g 141488 D 15.7 21.4 0:14.92 postgres
```

Step 3 & 4 can be repeated, and the peak memory of 'vacumm full' will continue to increase.

________________________________
发件人: PG Bug reporting form <noreply(at)postgresql(dot)org>
发送时间: 2024年1月11日 14:47
收件人: pgsql-bugs(at)lists(dot)postgresql(dot)org <pgsql-bugs(at)lists(dot)postgresql(dot)org>
抄送: fairyfar(at)msn(dot)com <fairyfar(at)msn(dot)com>
主题: BUG #18283: vacuum full use a large amount of memory (may cause OOM)

The following bug has been logged on the website:

Bug reference: 18283
Logged by: Zhu Yang
Email address: fairyfar(at)msn(dot)com
PostgreSQL version: 16.1
Operating system: Red Hat Enterprise Linux Server 7.6 (Maipo)
Description:

Under certain conditions, a vacuum full will use a lot of memory. The memory
usage is out of control, and an OOM may occur.

Step to reproduce the behavior:

Step 1. Create a new session("Sess 1"), then execute the following query and
keep the connection:
```sql
-- create table and insert data for testing.
create table t1 (
c1 character varying(100),
c2 character varying(100),
c3 character varying(100),
c4 character varying(100),
c5 character varying(100),
c6 character varying(100)
);

create function randomtext(len int) returns text as $$
select string_agg(md5(random()::text),'') from generate_series(1,$1/32)
$$ language sql;

insert into t1 select
randomtext(34),randomtext(34),randomtext(34),randomtext(34),randomtext(34),'RST'
from generate_series(1,1000000);
```

Step 2. Create another session("Sess 2"). Start a long transaction, then
create any table, and keep the connection:

```sql
BEGIN;
create table t2(a int);
```

Step 3. Go back to "Sess 1" and continue:

```sql
-- Get the PID of the current backend
postgres=# select pg_backend_pid();
pg_backend_pid
----------------
10511
(1 row)

-- Can update multiple times
update t1 set c2=randomtext(34) where k1='RST';
update t1 set c2=randomtext(34) where k1='RST';

vacuum full t1;
```

Step 4. Create a new bash terminal and view the memory usage of PID 10511
during the execution of 'vacumm full':

```bash
[yz(at)bogon ~]$ top -p 10511

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
10511 yz 20 0 1853340 1.6g 141488 D 15.7 21.4 0:14.92
postgres
```

You can observe that 'vacumm full' memory usage (VIRT & RES) is very high,
with the peak memory usage increasing with each Step 3 execution.

After analysis, the MemoryContext that consumes significant memory is "Table
rewrite", and the stack is:

```
#0 0x000000000050f700 in heap_copytuple ()
#1 0x000000000054f452 in rewrite_heap_tuple ()
#2 0x000000000054844f in reform_and_rewrite_tuple.isra.0 ()
#3 0x00000000005488e0 in heapam_relation_copy_for_cluster ()
#4 0x0000000000616760 in copy_table_data ()
#5 0x0000000000617846 in cluster_rel ()
#6 0x0000000000676973 in vacuum_rel ()
#7 0x0000000000677b9c in vacuum ()
#8 0x00000000006782dc in ExecVacuum ()
#9 0x0000000000808859 in standard_ProcessUtility ()
#10 0x0000000000806f5f in PortalRunUtility ()
#11 0x000000000080708b in PortalRunMulti ()
#12 0x000000000080755d in PortalRun ()
#13 0x0000000000803b28 in exec_simple_query ()
...
```

The code that causes the problem is in
src/backend/access/heap/rewriteheap.c:

```c
void
rewrite_heap_tuple(RewriteState state,
HeapTuple old_tuple, HeapTuple new_tuple)
{
...
if (!((old_tuple->t_data->t_infomask & HEAP_XMAX_INVALID) ||
HeapTupleHeaderIsOnlyLocked(old_tuple->t_data)) &&
!HeapTupleHeaderIndicatesMovedPartitions(old_tuple->t_data) &&
!(ItemPointerEquals(&(old_tuple->t_self),
&(old_tuple->t_data->t_ctid))))
{
// If the code executes inside this block, the allocated memory will not
be freed until the query ends.
...
```

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tender Wang 2024-01-17 07:48:23 Re: BUG #18297: Error when adding a column to a parent table with complex inheritance
Previous Message Michael Paquier 2024-01-16 23:59:26 Re: BUG #18274: Error 'invalid XML content'