Re: BUG #15660: pg_dump memory leaks when dumping LOBs

From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: iserbin(at)bostonsd(dot)ru, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>
Subject: Re: BUG #15660: pg_dump memory leaks when dumping LOBs
Date: 2022-09-16 08:59:45
Message-ID: 7da8823d83a2b66bdd917aa6cb2c5c2619d86011.camel@credativ.de
Lists: pgsql-bugs

Hi,

On Thu, 2019-02-28 at 12:30 -0500, Tom Lane wrote:
> Yeah.  We've heard about this repeatedly since large objects were
> promoted to be full-fledged objects with individual owners and ACL
> info.  It was argued at the time that large objects would be, well,
> large, and therefore there wouldn't be *that* many of them, and so
> it'd be okay for pg_dump to treat them as full-fledged objects with
> their own dump-table-of-contents entries.  Evidently, not everybody
> uses them that way :-(.

To give another data point, we recently encountered the same problem at
a customer site.

They have around 70 tables and a database size of 24 GB, but over 20
million rows in pg_largeobject. The directory dump they are using on an
XFS file system has 14.6 million files in it, leading to a directory
inode size of over 450 MB:

|drwx------+ 2 postgres postgres 454M 14. Sep 07:10 xxxxxxxxxtst-20220913

The TOC file is larger than 800 MB and has about 14.6 million lines, all
but around 700 of which are BLOB entries.
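
For reference, each large object has exactly one row in
pg_largeobject_metadata, so its row count matches the number of BLOB TOC
entries; a quick check could look like this (DBNAME being a placeholder
for the database name):

  psql -Atc "SELECT count(*) FROM pg_largeobject_metadata" DBNAME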

Dumping with pg_dump -Fd -j 2 leads to frequent out-of-memory situations
on this node due to the large amount of memory pg_dump needs (and also
because their TSM client apparently cannot cope with so many files and
takes up 20% of the RAM as well):

|postgres 43844 73.9 8.8 11711000 11680472 ? R 00:27 297:00 /srv/[...]/bin/pg_dump -Fd -b -f [...]/xxxxxxxxxtst-20220913 -j 2 -Z 1 XXXXXXXXXTST
|postgres 43832 0.0 8.8 11711000 11680236 ? S 00:27 0:03 /srv/[...]/bin/pg_dump -Fd -b -f [...]/xxxxxxxxxtst-20220913 -j 2 -Z 1 XXXXXXXXXTST
|postgres 38251 0.3 8.8 11710680 11683100 ? S 00:21 1:22 /srv/[...]/bin/pg_dump -Fd -b -f [...]/xxxxxxxxxtst-20220913 -j 2 -Z 1 XXXXXXXXXTST

> At some point I think we're going to have to revisit that decision in
> pg_dump. Maybe we could merge things so that we have one TOC entry
> for all blobs that share the same owner and ACL? But it comes up
> seldom enough that nobody's gotten motivated to do the work.

Yeah, it would be good if something could be done about this.
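
To illustrate, the number of TOC entries after such a merge would equal
the number of distinct (owner, ACL) combinations, which a sketch like
the following could show (DBNAME again being a placeholder):

  psql -c "SELECT lomowner::regrole, lomacl, count(*)
             FROM pg_largeobject_metadata GROUP BY 1, 2" DBNAME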

> In the short run, the only answer is to run pg_dump on beefier iron,
> or else split up the dump as Haribabu suggests (although TBH I'm
> afraid that might not work either --- if memory serves, pg_dump tends
> to pull in all the per-object info even if it's not going to dump it
> all ...)

In this case, we will try to advise the client not to use large objects
if possible (this is only a test instance so far).
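
That said, if large objects have to be kept, one workaround along the
lines of the split-up suggestion above might be to exclude them from the
main dump and export them separately in a single psql session, e.g.
(untested sketch; paths and DBNAME are placeholders, --no-blobs requires
PostgreSQL 10 or later, and the large objects' owner/ACL information is
not preserved this way):

  pg_dump -Fd -j 2 -Z 1 --no-blobs -f /backup/base.dir DBNAME
  psql -Atc "SELECT format('\lo_export %s /backup/lobs/%s', oid, oid)
               FROM pg_largeobject_metadata" DBNAME > lo_export.sql
  psql -f lo_export.sql DBNAME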

Michael

--
Michael Banck
Team Lead, PostgreSQL Team
Project Manager
Tel.: +49 2166 9901-171
E-Mail: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
VAT ID: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Managing Directors: Dr. Michael Meskes, Geoff Richardson, Peter Lilley

Our handling of personal data is subject to
the following policy: https://www.credativ.de/datenschutz
