BUG #15660: pg_dump memory leaks when dumping LOBs

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: iserbin(at)bostonsd(dot)ru
Subject: BUG #15660: pg_dump memory leaks when dumping LOBs
Date: 2019-02-27 11:29:16
Message-ID: 15660-db101adab620a17a@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 15660
Logged by: Ilya Serbin
Email address: iserbin(at)bostonsd(dot)ru
PostgreSQL version: 11.2
Operating system: CentOS 7
Description:

Hello,
One of our customers faced unexpected behaviour when using pg_dump to
export one of their databases: pg_dump consumed all of the server's memory
and was terminated with the following error:

....
pg_dump: reading row security enabled for table "public.databasechangelog"
pg_dump: reading policies for table "public.databasechangelog"
pg_dump: reading row security enabled for table "public.OperationAudit"
pg_dump: reading policies for table "public.OperationAudit"
pg_dump: reading large objects
pg_dump: [archiver (db)] query failed: out of memory for query result
pg_dump: [archiver (db)] query was: SELECT oid, (SELECT rolname FROM
pg_catalog.pg_roles WHERE oid = lomowner) AS rolname, lomacl FROM
pg_largeobject_metadata
....
Looking at the plan gave the following:

________________________
dbname=# explain SELECT oid, (SELECT rolname FROM pg_catalog.pg_roles WHERE
oid = lomowner) AS rolname, lomacl FROM pg_largeobject_metadata;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Seq Scan on pg_largeobject_metadata  (cost=0.00..531632880.18 rows=517114897 width=40)
   SubPlan 1
     ->  Seq Scan on pg_authid  (cost=0.00..1.01 rows=1 width=68)
           Filter: (oid = pg_largeobject_metadata.lomowner)
(4 rows)
________________________

So there are about 517 million entries in pg_largeobject_metadata; the total
database size is nearly 160 GB.
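
Even at the planner's estimated width of 40 bytes per row, 517 million rows
amounts to roughly 20 GB of raw data before any per-row overhead in the
client-side result, so it is not surprising that collecting the whole result
set at once runs the client out of memory. For comparison, a rough, untested
sketch (batch size chosen arbitrarily) of reading the same metadata through a
cursor, which keeps the client footprint flat because only one batch is
materialized at a time:

BEGIN;
-- Cursor over the same query pg_dump issues for large object metadata.
DECLARE lo_meta CURSOR FOR
    SELECT oid,
           (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = lomowner) AS rolname,
           lomacl
    FROM pg_largeobject_metadata;
-- Fetch in batches; repeat until FETCH returns no rows.
FETCH 10000 FROM lo_meta;
CLOSE lo_meta;
COMMIT;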
The customer used a 64-bit pg_dump with PostgreSQL 9.5.14. We also asked him
to try the version 10 pg_dump, and the issue was the same: pg_dump consumed
all of the server's memory (nearly 60 GB) and was terminated:
________________________
-bash-4.2$ pg_dump -V
pg_dump (PostgreSQL) 10.5

pg_dump -h host -p 5435 -Fd dbname -f /opt/pgsql/dump/dbname
out of memory
________________________
The customer was using Red Hat Enterprise Linux Server 7.6 + PostgreSQL 9.5.14.

We were able to reproduce the issue on CentOS 7 + PostgreSQL 11.2. The server
is a VM with 2 GB RAM and 1 CPU.
Here is the test table:

testdbl=# \d+ image
                                  Table "public.image"
 Column | Type | Collation | Nullable | Default | Storage  | Stats target | Description
--------+------+-----------+----------+---------+----------+--------------+-------------
 name   | text |           |          |         | extended |              |
 raster | oid  |           |          |         | plain    |              |

I've inserted 21 million+ entries using the following:

DO $$
DECLARE
  i INTEGER;
BEGIN
  FOR i IN 1 .. 5000000 LOOP
    INSERT INTO image (name, raster)
    VALUES ('withoutdata', lo_import('/etc/motd'));
  END LOOP;
END $$;

/etc/motd is a zero-sized file, so after inserting I got the following.

testdbl=# select count(*) from pg_catalog.pg_largeobject;
count
-------
0
(1 row)

testdbl=# select count(*) from pg_catalog.pg_largeobject_metadata;
count
----------
21100107
(1 row)

testdbl=# select count(*) from image;
count
----------
21100107
(1 row)

testdbl=# \l+ testdbl
                                               List of databases
  Name   |  Owner   | Encoding |   Collate   |    Ctype    | Access privileges |  Size   | Tablespace | Description
---------+----------+----------+-------------+-------------+-------------------+---------+------------+-------------
 testdbl | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |                   | 2400 MB | pg_default |
(1 row)

testdbl=# \dt+
List of relations
Schema | Name | Type | Owner | Size | Description
--------+-------+-------+----------+---------+-------------
public | image | table | postgres | 1050 MB |
(1 row)
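
As an aside, for anyone reproducing this without an empty file handy, a
similar pile of empty large objects can probably be created more directly
with the server-side lo_creat() function; an untested sketch, row count
chosen arbitrarily:

-- lo_creat(-1) creates a new, empty large object and returns its OID.
INSERT INTO image (name, raster)
SELECT 'withoutdata', lo_creat(-1)
FROM generate_series(1, 1000000);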

After trying to dump the testdbl database I got the same error:

[postgres(at)host ~]$ pg_dump -d testdbl -f test1.sql -v
pg_dump: last built-in OID is 16383
pg_dump: reading extensions
pg_dump: identifying extension members
pg_dump: reading schemas
pg_dump: reading user-defined tables
pg_dump: reading user-defined functions
pg_dump: reading user-defined types
pg_dump: reading procedural languages
pg_dump: reading user-defined aggregate functions
pg_dump: reading user-defined operators
pg_dump: reading user-defined access methods
pg_dump: reading user-defined operator classes
pg_dump: reading user-defined operator families
pg_dump: reading user-defined text search parsers
pg_dump: reading user-defined text search templates
pg_dump: reading user-defined text search dictionaries
pg_dump: reading user-defined text search configurations
pg_dump: reading user-defined foreign-data wrappers
pg_dump: reading user-defined foreign servers
pg_dump: reading default privileges
pg_dump: reading user-defined collations
pg_dump: reading user-defined conversions
pg_dump: reading type casts
pg_dump: reading transforms
pg_dump: reading table inheritance information
pg_dump: reading event triggers
pg_dump: finding extension tables
pg_dump: finding inheritance relationships
pg_dump: reading column info for interesting tables
pg_dump: finding the columns and types of table "public.image"
pg_dump: flagging inherited columns in subtables
pg_dump: reading indexes
pg_dump: flagging indexes in partitioned tables
pg_dump: reading extended statistics
pg_dump: reading constraints
pg_dump: reading triggers
pg_dump: reading rewrite rules
pg_dump: reading policies
pg_dump: reading row security enabled for table "public.image"
pg_dump: reading policies for table "public.image"
pg_dump: reading publications
pg_dump: reading publication membership
pg_dump: reading publication membership for table "public.image"
pg_dump: reading subscriptions
pg_dump: reading large objects

out of memory

All the memory is consumed during the "pg_dump: reading large objects" step.
I tried running pg_dump on another idle server with 10 GB of RAM and the
dump was created (~5 GB in size). Peak RAM usage was the following:

[postgres(at)host2 ~]$ free -m
              total        used        free      shared  buff/cache   available
Mem:           9665        8790         292         283         582         340
Swap:           955          38         917

I've tried dumping some regular tables without LOBs, and for a 2 GB table
with 20 million entries RAM usage fluctuates around 10 MB.
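
For reference (not something I have re-run on the 2 GB VM, so treat it as a
sketch): with psql's FETCH_COUNT variable set, the same metadata query is
read through a cursor in batches and should likewise stay within a small,
constant amount of client memory:

\set FETCH_COUNT 10000
\o /dev/null
SELECT oid,
       (SELECT rolname FROM pg_catalog.pg_roles WHERE oid = lomowner) AS rolname,
       lomacl
FROM pg_largeobject_metadata;
\o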

So, summarizing:
1) Affected versions are at least 9.5.14, 10.5 and 11.2; operating systems
are Red Hat 7.6 and CentOS 7.
2) The issue arises only for large objects, while reading
pg_largeobject_metadata.
3) The affected program is pg_dump. If postgres is running on host1 with RAM
usage of ~100 MB and we start "pg_dump -h host1" on host2, RAM usage on
host1 does not rise, while consumption is very high on host2.
4) Sometimes I get the following instead:
[postgres(at)host ~]$ pg_dump -d testdbl -f test1.sql -v
Killed

So the process is being killed by the OOM killer. This is from /var/log/messages:

Feb 27 04:23:05 host kernel: tuned invoked oom-killer: gfp_mask=0x201da,
order=0, oom_score_adj=0
Feb 27 04:23:05 host kernel: tuned cpuset=/ mems_allowed=0
Feb 27 04:23:05 host kernel: CPU: 1 PID: 1326 Comm: tuned Kdump: loaded Not
tainted 3.10.0-862.el7.x86_64 #1
Feb 27 04:23:05 host kernel: Hardware name: innotek GmbH
VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Feb 27 04:23:05 host kernel: Call Trace:
Feb 27 04:23:05 host kernel: [<ffffffff93d0d768>] dump_stack+0x19/0x1b
Feb 27 04:23:05 host kernel: [<ffffffff93d090ea>] dump_header+0x90/0x229
Feb 27 04:23:05 host kernel: [<ffffffff936f76e2>] ?
ktime_get_ts64+0x52/0xf0
Feb 27 04:23:05 host kernel: [<ffffffff9374e0bf>] ?
delayacct_end+0x8f/0xb0
Feb 27 04:23:05 host kernel: [<ffffffff93797904>]
oom_kill_process+0x254/0x3d0
Feb 27 04:23:05 host kernel: [<ffffffff937973ad>] ?
oom_unkillable_task+0xcd/0x120
Feb 27 04:23:05 host kernel: [<ffffffff93797456>] ?
find_lock_task_mm+0x56/0xc0
Feb 27 04:23:05 host kernel: [<ffffffff93798146>]
out_of_memory+0x4b6/0x4f0
Feb 27 04:23:05 host kernel: [<ffffffff9379ea64>]
__alloc_pages_nodemask+0xaa4/0xbb0
Feb 27 04:23:05 host kernel: [<ffffffff937e8528>]
alloc_pages_current+0x98/0x110
Feb 27 04:23:05 host kernel: [<ffffffff93793cf7>]
__page_cache_alloc+0x97/0xb0
Feb 27 04:23:05 host kernel: [<ffffffff937963f8>]
filemap_fault+0x298/0x490
Feb 27 04:23:05 host kernel: [<ffffffffc04e585f>]
xfs_filemap_fault+0x5f/0xe0 [xfs]
Feb 27 04:23:05 host kernel: [<ffffffff937c05fa>]
__do_fault.isra.58+0x8a/0x100
Feb 27 04:23:05 host kernel: [<ffffffff937c0b5c>]
do_read_fault.isra.60+0x4c/0x1a0
Feb 27 04:23:05 host kernel: [<ffffffff937c52dc>]
handle_pte_fault+0x2dc/0xc30
Feb 27 04:23:05 host kernel: [<ffffffff937c747d>]
handle_mm_fault+0x39d/0x9b0
Feb 27 04:23:05 host kernel: [<ffffffff93d1a597>]
__do_page_fault+0x197/0x4f0
Feb 27 04:23:05 host kernel: [<ffffffff93d1a925>] do_page_fault+0x35/0x90
Feb 27 04:23:05 host kernel: [<ffffffff93d16ab6>] ? error_swapgs+0xa7/0xbd
Feb 27 04:23:05 host kernel: [<ffffffff93d16768>] page_fault+0x28/0x30
Feb 27 04:23:05 host kernel: Mem-Info:
Feb 27 04:23:05 host kernel: active_anon:329762 inactive_anon:110659
isolated_anon:0#012 active_file:2 inactive_file:14 isolated_file:0#012
unevictable:0 dirty:0 writeback:0 unstable:0#012 slab_reclaimable:3578
slab_unreclaimable:3636#012 mapped:606 shmem:752 pagetables:3709
bounce:0#012 free:13064 free_pcp:98 free_cma:0
Feb 27 04:23:05 rrulmdbemba01 kernel: Node 0 DMA free:7632kB min:380kB
low:472kB high:568kB active_anon:3860kB inactive_anon:4120kB active_file:0kB
inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB
present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB
mapped:476kB shmem:500kB slab_reclaimable:60kB slab_unreclaimable:88kB
kernel_stack:0kB pagetables:124kB unstable:0kB bounce:0kB free_pcp:0kB
local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0
all_unreclaimable? yes
Feb 27 04:23:05 host kernel: lowmem_reserve[]: 0 1819 1819 1819
Feb 27 04:23:05 host kernel: Node 0 DMA32 free:44624kB min:44672kB
low:55840kB high:67008kB active_anon:1315188kB inactive_anon:438516kB
active_file:8kB inactive_file:56kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:2080704kB managed:1866788kB mlocked:0kB dirty:0kB
writeback:0kB mapped:1948kB shmem:2508kB slab_reclaimable:14252kB
slab_unreclaimable:14456kB kernel_stack:2368kB pagetables:14712kB
unstable:0kB bounce:0kB free_pcp:392kB local_pcp:120kB free_cma:0kB
writeback_tmp:0kB pages_scanned:76 all_unreclaimable? yes
Feb 27 04:23:05 host kernel: lowmem_reserve[]: 0 0 0 0
Feb 27 04:23:05 host kernel: Node 0 DMA: 8*4kB (UEM) 10*8kB (UM) 6*16kB (UE)
2*32kB (UE) 1*64kB (M) 3*128kB (UEM) 5*256kB (EM) 3*512kB (EM) 2*1024kB (UE)
1*2048kB (E) 0*4096kB = 7632kB
Feb 27 04:23:05 host kernel: Node 0 DMA32: 468*4kB (UE) 276*8kB (UE)
360*16kB (UE) 160*32kB (UEM) 66*64kB (UE) 21*128kB (UE) 3*256kB (UE) 1*512kB
(U) 21*1024kB (EM) 0*2048kB 0*4096kB = 44656kB
Feb 27 04:23:05 host kernel: Node 0 hugepages_total=0 hugepages_free=0
hugepages_surp=0 hugepages_size=2048kB
Feb 27 04:23:05 host kernel: 22708 total pagecache pages
Feb 27 04:23:05 host kernel: 21947 pages in swap cache
Feb 27 04:23:05 host kernel: Swap cache stats: add 5578491, delete 5556313,
find 1588377/2039602
Feb 27 04:23:05 host kernel: Free swap = 0kB
Feb 27 04:23:05 host kernel: Total swap = 978940kB
Feb 27 04:23:05 host kernel: 524174 pages RAM
Feb 27 04:23:05 host kernel: 0 pages HighMem/MovableOnly
Feb 27 04:23:05 host kernel: 53500 pages reserved

I'm sorry for any inconvenience; this is my first bug report. Please let me
know if I can provide any additional details.

Best Regards,
Ilya
