From: | Linas Virbalas <linas(dot)virbalas(at)continuent(dot)com> |
---|---|
To: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Florian Pflug <fgp(at)phlo(dot)org> |
Cc: | Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Hot Backup with rsync fails at pg_clog if under load |
Date: | 2011-09-23 12:47:20 |
Message-ID: | CAA25A88.1E7B5%linas.virbalas@continuent.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 9/23/11 12:05 PM, "Heikki Linnakangas"
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>>> It looks to me that pg_clog/0001 exists, but it shorter than recovery
>>> expects. Which shouldn't happen, of course, because the start-backup
>>> checkpoint should flush all the clog that's needed by recovery to disk
>>> before the backup procedure begins to them.
>>
>> Yeah. What confuses me though is that we fail while trying to read from the
>> clog. When do we do that during normal (non-standby) recovery?
>
> I believe the OP is setting up a standby. He didn't say if it's a hot
> standby, though. That changes the startup sequence a bit.
I confirm that it is a Hot Standby with Streaming Replication.
>> One other possibility would be that the problem is somehow triggered by
>> vacuum running while the start-backup checkpoint is commencing. That'd
>> explain why the problem goes away with immediate checkpoints, because those
>> make the windows mach smaller. But I haven't got a concrete theory of whats
>> happening..
>
> Hmm, that's a good theory. The logic in SimpleLruPhysicakReadPage()
> tolerates non-existent files, because the old clog files might've been
> truncated away by a vacuum after the backup started. That's OK, because
> they will be recreated, and later deleted again, during the WAL replay.
> But what if something like this happens:
>
> 0. pg_start_backup().
> 1. rsync starts. It gets a list of all files. It notes that pg_clog/0001
> exists.
> 2. vacuum runs, and deletes pg_clog/0001
> 3. rsync starts to copy pg_clog/0001. Oops, it's gone. Instead of
> leaving it out altogether, it includes it as an empty file.
I cannot confirm that. I have tried this scenario one more time and was
observing how pg_clog/ looked on the master and standby. Here's an overview:
1. pg_start_backup('base_backup') - waiting...
2. Checking the size of pg_clog periodically on the master:
-bash-3.2$ ls -l pg_clog
total 8
-rw------- 1 postgres postgres 8192 Sep 23 14:31 0000
-bash-3.2$ ls -l pg_clog
total 8
-rw------- 1 postgres postgres 8192 Sep 23 14:31 0000
-bash-3.2$ ls -l pg_clog
total 16
3. Somewhere just before (1) step releases and we get into rsync phase, the
pg_clog size changes:
-rw------- 1 postgres postgres 16384 Sep 23 14:33 0000
-bash-3.2$ ls -l pg_clog
total 16
-rw------- 1 postgres postgres 16384 Sep 23 14:33 0000
-bash-3.2$ ls -l pg_clog
total 16
-rw------- 1 postgres postgres 16384 Sep 23 14:33 0000
4. The rsync does transfer the file:
...
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,881 INFO
management.script.ScriptPlugin base/16394/16405_fsm
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,881 INFO
management.script.ScriptPlugin base/16394/pg_internal.init
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO
management.script.ScriptPlugin global/
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO
management.script.ScriptPlugin global/12691
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO
management.script.ScriptPlugin global/12696
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO
management.script.ScriptPlugin global/12697
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO
management.script.ScriptPlugin global/pg_internal.init
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,887 INFO
management.script.ScriptPlugin pg_clog/0000
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,887 INFO
management.script.ScriptPlugin pg_multixact/offsets/0000
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,887 INFO
management.script.ScriptPlugin pg_notify/
...
But on the standby its size is the old one (thus, it seems, that the size
changed after the rsync transfer and before the pg_stop_backup() was
called):
ls -l pg_clog/
total 8
-rw------- 1 postgres postgres 8192 Sep 23 14:31 0000
5. Hence, the server doesn't start and pg_log complains:
2011-09-23 14:33:45 CEST LOG: streaming replication successfully connected
to primary
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: could not access status of transaction
37206
2011-09-23 14:33:46 CEST DETAIL: Could not read from file "pg_clog/0000" at
offset 8192: Success.
6. Now, if I do something that, of course, should never be done, and copy
this 0000 file from master to the standby soon enough (without even
starting/stopping backup), the standby starts up successfully.
--
Hope this helps,
Linas Virbalas
http://flyingclusters.blogspot.com/
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2011-09-23 12:49:25 | Re: Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present |
Previous Message | Robert Haas | 2011-09-23 12:42:42 | Re: [v9.2] make_greater_string() does not return a string in some cases |