Re: streaming replication, "frozen snapshot backup on it" and missing relfile (postgres 9.2.3 on xfs + LVM)

From: Benedikt Grundmann <bgrundmann(at)janestreet(dot)com>
To: David Powers <dpowers(at)janestreet(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: streaming replication, "frozen snapshot backup on it" and missing relfile (postgres 9.2.3 on xfs + LVM)
Date: 2013-05-28 14:53:59
Message-ID: CADbMkNPpn6HB9DXu=0zvq34-uC1cPu6AO=dc6A0OSsnG1dZXXQ@mail.gmail.com
Lists: pgsql-hackers

Today we have seen:

2013-05-28 04:11:12.300 EDT,,,30600,,51a41946.7788,1,,2013-05-27 22:41:10
EDT,,0,ERROR,XX000,"xlog flush request 1E95/AFB2DB10 is not satisfied ---
flushed only to 1E7E/21CB79A0",,,,,"writing block 9 of relation
base/16416/293974676",,,,""
2013-05-28 04:11:13.316 EDT,,,30600,,51a41946.7788,2,,2013-05-27 22:41:10
EDT,,0,ERROR,XX000,"xlog flush request 1E95/AFB2DB10 is not satisfied ---
flushed only to 1E7E/21CB79A0",,,,,"writing block 9 of relation
base/16416/293974676",,,,""

while taking the *backup of the primary*. We had been running like that
for a few days, and today is the first day we see these problems again.
So it's not entirely deterministic, and we don't yet know what we have to
do to reproduce it.
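For scale: in 9.2 an LSN X/Y is (X * 2^32 + Y) bytes of WAL (modulo the
pre-9.3 quirk that the last 16MB segment of each 4GiB log file is skipped),
so the distance between the two positions in the error above is easy to
compute; a quick back-of-the-envelope bash sketch:

```shell
#!/bin/bash
# Rough distance between the two LSNs in the error message above.
# In 9.2 an LSN "X/Y" denotes roughly (X * 2^32 + Y) bytes of WAL.
lsn_bytes() {
  local hi=${1%/*} lo=${1#*/}
  echo $(( 16#$hi * 4294967296 + 16#$lo ))
}

req=$(lsn_bytes 1E95/AFB2DB10)      # flush request from the error
flushed=$(lsn_bytes 1E7E/21CB79A0)  # position actually flushed
gap_gib=$(( (req - flushed) / 1024 / 1024 / 1024 ))
echo "page LSN is ~${gap_gib} GiB ahead of the flushed WAL"
```

In other words, the page being written carries an LSN roughly 94 GiB ahead
of the end of WAL on this cluster, which would fit data files that are
newer than the WAL that was replayed.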

This makes Robert's theory more likely. However, we have also been using
this method (LVM snapshot + rsync with hardlinks from the primary) for
years without these problems. So the big question is: what changed?

One hypothesis is that it is related to the primary's wal_level being
hot_standby instead of minimal or archive (which we used before we
switched to 9.2).
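For reference, the setting in question (a postgresql.conf fragment; the
max_wal_senders value is a hypothetical illustration, not our actual
configuration):

```
# postgresql.conf on the primary (9.2)
wal_level = hot_standby    # was minimal or archive before the 9.2 switch
max_wal_senders = 3        # hypothetical value; must be > 0 for streaming
# hot_standby = on is set on the standby side to allow read-only queries
```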

Here are the entries in the log related to the startup of the corrupt
testing cluster:

2013-05-27 22:41:10.029 EDT,,,30598,,51a41946.7786,1,,2013-05-27 22:41:10
EDT,,0,LOG,00000,"database system was interrupted; last known up at
2013-05-26 21:01:09 EDT",,,,,,,,,""
2013-05-27 22:41:10.029 EDT,,,30599,"",51a41946.7787,1,"",2013-05-27
22:41:10 EDT,,0,LOG,00000,"connection received: host=172.27.65.204
port=55279",,,,,,,,,""
2013-05-27 22:41:10.030 EDT,,,30598,,51a41946.7786,2,,2013-05-27 22:41:10
EDT,,0,LOG,00000,"database system was not properly shut down; automatic
recovery in progress",,,,,,,,,""
2013-05-27 22:41:10.030 EDT,"as-elephant","postgres",30599,"
172.27.65.204:55279",51a41946.7787,2,"",2013-05-27 22:41:10
EDT,,0,FATAL,57P03,"the database system is starting up",,,,,,,,,""
2013-05-27 22:41:10.031 EDT,,,30598,,51a41946.7786,3,,2013-05-27 22:41:10
EDT,,0,LOG,00000,"redo starts at 1E7E/2152B178",,,,,,,,,""
2013-05-27 22:41:10.094 EDT,,,30598,,51a41946.7786,4,,2013-05-27 22:41:10
EDT,,0,LOG,00000,"record with zero length at 1E7E/215AC6B8",,,,,,,,,""
2013-05-27 22:41:10.094 EDT,,,30598,,51a41946.7786,5,,2013-05-27 22:41:10
EDT,,0,LOG,00000,"redo done at 1E7E/215AC688",,,,,,,,,""
2013-05-27 22:41:10.094 EDT,,,30598,,51a41946.7786,6,,2013-05-27 22:41:10
EDT,,0,LOG,00000,"last completed transaction was at log time 2013-05-26
21:09:08.06351-04",,,,,,,,,""
2013-05-27 22:41:10.134 EDT,,,30595,,51a41945.7783,1,,2013-05-27 22:41:09
EDT,,0,LOG,00000,"database system is ready to accept connections",,,,,,,,,""
2013-05-27 22:41:10.134 EDT,,,30603,,51a41946.778b,1,,2013-05-27 22:41:10
EDT,,0,LOG,00000,"autovacuum launcher started",,,,,,,,,""
2013-05-27 22:41:15.037 EDT,,,30608,"",51a4194b.7790,1,"",2013-05-27
22:41:15 EDT,,0,LOG,00000,"connection received: host=172.27.65.204
port=55283",,,,,,,,,""

This means we currently do NOT have a way to make backups that we trust.
We are very open to suggestions of alternative methods we should consider
using. The database is of non-trivial size by now:

proddb=> select pg_size_pretty(pg_database_size('proddb'));
pg_size_pretty
----------------
1294 GB
(1 row)

The backup script itself is by now a rather long OCaml program, so I
doubt the value of posting it to this list. But here is the log of what
it did, which should be fairly self-explanatory:

proddb backup: starting
proddb backup: /bin/bash -c "/usr/bin/ssh -l root tot-dbc-001 ls -d
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb'.*-*-*.*-*-*'"
(enqueued)
proddb backup: /bin/bash -c "/usr/bin/ssh -l root tot-dbc-001 ls -d
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb'.*-*-*.*-*-*'" (running
as pid: [23422])
proddb backup: /bin/bash -c "/usr/bin/ssh -l root tot-dbc-001 ls -d
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb'.*-*-*.*-*-*'" ([23422]
exited normally)
proddb backup: /bin/mkdir -p
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress (enqueued)
proddb backup: /bin/mkdir -p
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress (running as
pid: [23433])
proddb backup: /bin/mkdir -p
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress ([23433]
exited normally)
proddb backup: /bin/bash -c "/usr/bin/ssh -l root tot-dbc-001
/mnt/global/base/bin/db tools backup backup -v -src '"((dbname proddb)
(hostname tot-dbc-001) (superuser postgres_prod) (basedir /database)
(version 9.2))"' -dst '"((username postgres) (hostname 127.0.0.1)
(backup_dir
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress)
(last_backup_dir (..//proddb.2013-05-26.20-00-00)
proddb backup: /bin/bash -c "/usr/bin/ssh -l root tot-dbc-001
/mnt/global/base/bin/db tools backup backup -v -src '"((dbname proddb)
(hostname tot-dbc-001) (superuser postgres_prod) (basedir /database)
(version 9.2))"' -dst '"((username postgres) (hostname 127.0.0.1)
(backup_dir
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress)
(last_backup_dir (..//proddb.2013-05-26.20-00-00)
proddb backup [23437]: 2013-05-27 20:00:02.232864-04:00 Debug backing up
tot-dbc-001:proddb to ((username postgres) (hostname 127.0.0.1)
proddb backup [23437]: (backup_dir
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress)
proddb backup [23437]: (last_backup_dir (..//proddb.2013-05-26.20-00-00)))
proddb backup [23437]: 2013-05-27 20:00:02.232869-04:00 Debug running
/usr/pgsql-9.2/bin/psql -U postgres_prod -h tot-dbc-001 -d postgres -c
checkpoint
proddb backup [23437]: 2013-05-27 20:00:02.459035-04:00 Debug Running
preliminary rsync outside of snapshot to minimize its lifetime.
proddb backup [23437]: 2013-05-27 20:00:02.459049-04:00 Debug running rsync
from /database/ to postgres@127.0.0.1:
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress//
proddb backup [23437]: 2013-05-27 20:00:02.459053-04:00 Debug running nice
-n 19 rsync --link-dest ..//proddb.2013-05-26.20-00-00 -a -q --delete
/database/ postgres@127.0.0.1:
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress//
proddb backup [23437]: 2013-05-27 21:43:05.742036-04:00 Debug Taking
snapshot of database
proddb backup [23437]: 2013-05-27 21:43:05.742056-04:00 Debug finding LVM
device for /database
proddb backup [23437]: 2013-05-27 21:43:05.796354-04:00 Debug real path of
/database is /mnt/dbc01-d1/proddb
proddb backup [23437]: 2013-05-27 21:43:05.797222-04:00 Debug /database is
located on LVM device /dev/dbc01/d1 at relative path /proddb
proddb backup [23437]: 2013-05-27 21:43:05.797237-04:00 Debug running
lvcreate --size 500G --snapshot --name proddb-snapshot /dev/dbc01/d1
proddb backup [23437]: 2013-05-27 21:43:06.487274-04:00 Debug running mount
-o nouuid,ro /dev/dbc01/proddb-snapshot /tmp/snapfchjHv
proddb backup [23437]: 2013-05-27 21:43:06.682700-04:00 Debug running rsync
from /tmp/snapfchjHv/proddb/ to postgres@127.0.0.1:
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress//
proddb backup [23437]: 2013-05-27 21:43:06.682710-04:00 Debug running nice
-n 19 rsync --inplace --link-dest ..//proddb.2013-05-26.20-00-00 -a -q
--delete /tmp/snapfchjHv/proddb/ postgres@127.0.0.1:
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress//
proddb backup [23437]: 2013-05-27 22:38:45.817805-04:00 Debug Deleting
snapshot of database
proddb backup [23437]: 2013-05-27 22:38:45.817824-04:00 Debug running
umount /tmp/snapfchjHv
proddb backup [23437]: 2013-05-27 22:38:47.096200-04:00 Debug removing
mount point /tmp/snapfchjHv
proddb backup [23437]: 2013-05-27 22:38:47.096732-04:00 Debug running
lvremove -f /dev/dbc01/proddb-snapshot
proddb backup [23437]: 2013-05-27 22:38:47.532788-04:00 Debug running
lvdisplay /dev/dbc01/proddb-snapshot
proddb backup: /bin/bash -c "/usr/bin/ssh -l root tot-dbc-001
/mnt/global/base/bin/db tools backup backup -v -src '"((dbname proddb)
(hostname tot-dbc-001) (superuser postgres_prod) (basedir /database)
(version 9.2))"' -dst '"((username postgres) (hostname 127.0.0.1)
(backup_dir
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress)
(last_backup_dir (..//proddb.2013-05-26.20-00-00)
proddb backup: /usr/bin/ssh postgres@tot-dbc-001 cd
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress (enqueued)
proddb backup: /usr/bin/ssh postgres@tot-dbc-001 cd
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress (running as
pid: [14318])
proddb backup: /usr/bin/ssh postgres@tot-dbc-001 cd
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress ([14318]
exited normally)
proddb backup: /usr/bin/ssh postgres@tot-dbc-001 mv
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.2013-05-27.20-00-00
(enqueued)
proddb backup: /usr/bin/ssh postgres@tot-dbc-001 mv
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.2013-05-27.20-00-00
(running as pid: [14319])
proddb backup: /usr/bin/ssh postgres@tot-dbc-001 mv
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress
/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.2013-05-27.20-00-00
([14319] exited normally)
proddb backup: finished; status = ok
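Condensed, the sequence the log records is roughly the following (our
summary as a dry-run bash sketch, not the actual OCaml program; run() only
echoes each command instead of executing it):

```shell
#!/bin/bash
# Dry-run sketch of the backup sequence from the log above.
# run() only echoes; nothing is actually executed.
set -u
LOG=""
run() { echo "+ $*"; LOG="$LOG$*"$'\n'; }

SNAP_MNT=/tmp/snapfchjHv
DEST='postgres@127.0.0.1:/net/nyc-isilon1/ifs/data/backup-dbc//proddb/proddb.in-progress//'
PREV='..//proddb.2013-05-26.20-00-00'

# 1. Checkpoint so the data files are as current as possible before copying.
run psql -U postgres_prod -h tot-dbc-001 -d postgres -c checkpoint
# 2. Preliminary rsync outside the snapshot, to minimize snapshot lifetime.
run nice -n 19 rsync --link-dest "$PREV" -a -q --delete /database/ "$DEST"
# 3. Freeze: LVM snapshot of the volume, mounted read-only.
run lvcreate --size 500G --snapshot --name proddb-snapshot /dev/dbc01/d1
run mount -o nouuid,ro /dev/dbc01/proddb-snapshot "$SNAP_MNT"
# 4. Final rsync from the frozen snapshot, hardlinked against the previous backup.
run nice -n 19 rsync --inplace --link-dest "$PREV" -a -q --delete \
    "$SNAP_MNT/proddb/" "$DEST"
# 5. Tear down the snapshot.
run umount "$SNAP_MNT"
run lvremove -f /dev/dbc01/proddb-snapshot
```

Note there is no pg_start_backup()/pg_stop_backup() pair here: the
consistency of the result rests entirely on the LVM snapshot being atomic,
since the preliminary rsync runs against a live data directory and is only
corrected by the final pass from the snapshot.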

Thanks in advance,

Bene

On Thu, May 23, 2013 at 5:30 PM, David Powers <dpowers(at)janestreet(dot)com> wrote:

> Thanks for the response.
>
> I have some evidence against an issue in the backup procedure (though I'm
> not ruling it out). We moved back to taking the backup off of the primary
> and all errors for all three clusters went away. All of the hardware is
> the same, OS and postgres versions are largely the same (9.2.3 vs. 9.2.4 in
> some cases, various patch levels of CentOS 6.3 for the OS). The backup code
> is exactly the same, just pointed at a different set of boxes.
>
> Currently I'm just running for a couple of days to ensure that we have
> viable static backups. After that I'll redo one of the restores from a
> suspected backup and will post the logs.
>
> -David
>
>
> On Thu, May 23, 2013 at 11:26 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> On Tue, May 21, 2013 at 11:59 AM, Benedikt Grundmann
>> <bgrundmann(at)janestreet(dot)com> wrote:
>> > We are seeing these errors on a regular basis on the testing box now.
>> > We have even changed the backup script to shut down the hot standby,
>> > take an LVM snapshot, restart the hot standby, and rsync the LVM
>> > snapshot. It still happens.
>> >
>> > We have never seen this before we introduced the hot standby. So we
>> > will now revert to taking the backups from LVM snapshots on the
>> > production database. If you have ideas of what else we should try /
>> > what information we can give you to debug this, let us know and we
>> > will try to do so.
>> >
>> > Until then we will sadly operate on the assumption that the combination
>> > of hot standby and "frozen snapshot" backup of it is not production
>> > ready.
>>
>> I'm pretty suspicious that your backup procedure is messed up in some
>> way. The fact that you got invalid page headers is really difficult
>> to attribute to a PostgreSQL bug. A number of the other messages that
>> you have posted also tend to indicate either corruption, or that WAL
>> replay has stopped early. It would be interesting to see the logs
>> from when the clone was first started up, juxtaposed against the later
>> WAL flush error messages.
>>
>> --
>> Robert Haas
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>>
>
>
