From: | Craig James <cjames(at)emolecules(dot)com> |
---|---|
To: | "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org> |
Subject: | pg_basebackup bug: base backup is double the size of the database |
Date: | 2015-01-21 17:32:06 |
Message-ID: | CAFwQ8reMSPCeWH+n_LrZeLg=auQh4LiypKhcYGaROdnxet9mYw@mail.gmail.com |
Lists: | pgsql-admin |
We've encountered a serious bug with pg_basebackup. It seems to be
following hard links and duplicating all files in the tablespaces rather
than preserving links.
Drilling down into one specific tablespace, we find this:
# ls -l /data/postgres-9.3/main/pg_tblspc/16747
lrwxrwxrwx 1 postgres postgres 27 2014-08-18 11:28 /data/postgres-9.3/main/pg_tblspc/16747 -> /postgres/tablespaces/uorsy/
# du -sh /data/postgres-9.3/tablespaces/uorsy
*35G* /data/postgres-9.3/tablespaces/uorsy
# du -sh /data/postgres-9.3/tablespaces/uorsy/*
*35G* /data/postgres-9.3/tablespaces/uorsy/8208624
*8.1M* /data/postgres-9.3/tablespaces/uorsy/PG_9.3_201306121
4.0K /data/postgres-9.3/tablespaces/uorsy/pgsql_tmp
4.0K /data/postgres-9.3/tablespaces/uorsy/PG_VERSION
# find /data/postgres-9.3/tablespaces/uorsy \! -links 1 -type f | wc -l
*740*
In other words, this tablespace has 35G of real data, plus 740 hard links
that effectively duplicate each data file.
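For anyone who wants to reproduce this measurement in one pass, here is a minimal sketch. It assumes GNU du, whose -l (--count-links) option counts a file once per hard link instead of once per inode. The function name hardlink_report is hypothetical and is not part of PostgreSQL or barman.

```shell
# Hypothetical helper: report hard-linked files and both size totals.
hardlink_report() {
    hr_dir="$1"
    # Regular files whose link count is not 1 (same test as the find above)
    hr_linked=$(find "$hr_dir" -type f ! -links 1 | wc -l | tr -d ' ')
    # Default du: each inode is counted once, however many names it has
    hr_once=$(du -sk "$hr_dir" | cut -f1)
    # du -l: each name is counted, which is what a link-unaware copy would occupy
    hr_each=$(du -skl "$hr_dir" | cut -f1)
    echo "linked_files=$hr_linked once_kb=$hr_once each_kb=$hr_each"
}
```

Run as `hardlink_report /data/postgres-9.3/tablespaces/uorsy`. If each_kb is roughly double once_kb, a copy that does not preserve links would be expected to need about twice the space.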
When we look at the same data in the archive that pg_basebackup creates
(invoked via barman), we find this:
# du -sh /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747
*70G* /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747
# du -sh /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/*
*35G* /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/8208624
*35G* /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/PG_9.3_201306121
4.0K /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/pgsql_tmp
4.0K /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/PG_VERSION
# find /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747 \! -links 1 -type f | wc -l
*0*
That is, the archive contains no hard links, and every data file is stored
twice. And of course, when we try to actually use this archive to recover,
it's twice the size of the original database and doesn't fit on our disks.
My guess is that pg_basebackup is using (or doing the equivalent of)
rsync(1) without the --hard-links option, and that these hard links were
created by pg_upgrade when we went from 8.4.17 to 9.3.5.
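To illustrate the rsync part of that guess only: this is a sketch of a link-preserving copy, not a statement of what pg_basebackup or barman actually runs. The function name copy_preserving_links and its arguments are hypothetical.

```shell
# Hypothetical helper: copy a directory tree with rsync, keeping hard links.
copy_preserving_links() {
    cpl_src="$1"
    cpl_dest="$2"
    # -a (archive mode) keeps permissions, times and symlinks, but does NOT imply -H
    # -H / --hard-links recreates hard links among the files in this transfer
    rsync -a --hard-links "$cpl_src"/ "$cpl_dest"/
}
```

rsync can only relink files that belong to the same invocation, so both the 8208624 and PG_9.3_201306121 directories would need to be part of a single transfer for the links between them to survive.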
What can we do to fix this? The whole cluster is about 350 databases and
800GB.
Thanks,
Craig