Re: Hot Backup with rsync fails at pg_clog if under load

From: Linas Virbalas <linas(dot)virbalas(at)continuent(dot)com>
To: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Florian Pflug <fgp(at)phlo(dot)org>
Cc: Euler Taveira de Oliveira <euler(at)timbira(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Hot Backup with rsync fails at pg_clog if under load
Date: 2011-09-23 12:47:20
Message-ID: CAA25A88.1E7B5%linas.virbalas@continuent.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 9/23/11 12:05 PM, "Heikki Linnakangas"
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:

>>> It looks to me that pg_clog/0001 exists, but it shorter than recovery
>>> expects. Which shouldn't happen, of course, because the start-backup
>>> checkpoint should flush all the clog that's needed by recovery to disk
>>> before the backup procedure begins to them.
>>
>> Yeah. What confuses me though is that we fail while trying to read from the
>> clog. When do we do that during normal (non-standby) recovery?
>
> I believe the OP is setting up a standby. He didn't say if it's a hot
> standby, though. That changes the startup sequence a bit.

I confirm that it is a Hot Standby with Streaming Replication.

>> One other possibility would be that the problem is somehow triggered by
>> vacuum running while the start-backup checkpoint is commencing. That'd
>> explain why the problem goes away with immediate checkpoints, because those
>> make the windows mach smaller. But I haven't got a concrete theory of whats
>> happening..
>
> Hmm, that's a good theory. The logic in SimpleLruPhysicakReadPage()
> tolerates non-existent files, because the old clog files might've been
> truncated away by a vacuum after the backup started. That's OK, because
> they will be recreated, and later deleted again, during the WAL replay.
> But what if something like this happens:
>
> 0. pg_start_backup().
> 1. rsync starts. It gets a list of all files. It notes that pg_clog/0001
> exists.
> 2. vacuum runs, and deletes pg_clog/0001
> 3. rsync starts to copy pg_clog/0001. Oops, it's gone. Instead of
> leaving it out altogether, it includes it as an empty file.

I cannot confirm that. I have tried this scenario one more time and was
observing how pg_clog/ looked on the master and standby. Here's an overview:

1. pg_start_backup('base_backup') - waiting...

2. Checking the size of pg_clog periodically on the master:

-bash-3.2$ ls -l pg_clog
total 8
-rw------- 1 postgres postgres 8192 Sep 23 14:31 0000
-bash-3.2$ ls -l pg_clog
total 8
-rw------- 1 postgres postgres 8192 Sep 23 14:31 0000
-bash-3.2$ ls -l pg_clog
total 16

3. Somewhere just before (1) step releases and we get into rsync phase, the
pg_clog size changes:

-rw------- 1 postgres postgres 16384 Sep 23 14:33 0000
-bash-3.2$ ls -l pg_clog
total 16
-rw------- 1 postgres postgres 16384 Sep 23 14:33 0000
-bash-3.2$ ls -l pg_clog
total 16
-rw------- 1 postgres postgres 16384 Sep 23 14:33 0000

4. The rsync does transfer the file:

...
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,881 INFO
management.script.ScriptPlugin base/16394/16405_fsm
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,881 INFO
management.script.ScriptPlugin base/16394/pg_internal.init
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO
management.script.ScriptPlugin global/
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO
management.script.ScriptPlugin global/12691
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO
management.script.ScriptPlugin global/12696
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO
management.script.ScriptPlugin global/12697
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,886 INFO
management.script.ScriptPlugin global/pg_internal.init
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,887 INFO
management.script.ScriptPlugin pg_clog/0000
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,887 INFO
management.script.ScriptPlugin pg_multixact/offsets/0000
INFO | jvm 1 | 2011/09/23 14:33:43 | 2011-09-23 14:33:43,887 INFO
management.script.ScriptPlugin pg_notify/
...

But on the standby its size is the old one (thus, it seems, that the size
changed after the rsync transfer and before the pg_stop_backup() was
called):

ls -l pg_clog/
total 8
-rw------- 1 postgres postgres 8192 Sep 23 14:31 0000

5. Hence, the server doesn't start and pg_log complains:

2011-09-23 14:33:45 CEST LOG: streaming replication successfully connected
to primary
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: the database system is starting up
2011-09-23 14:33:46 CEST FATAL: could not access status of transaction
37206
2011-09-23 14:33:46 CEST DETAIL: Could not read from file "pg_clog/0000" at
offset 8192: Success.

6. Now, if I do something that, of course, should never be done, and copy
this 0000 file from master to the standby soon enough (without even
starting/stopping backup), the standby starts up successfully.

--
Hope this helps,
Linas Virbalas
http://flyingclusters.blogspot.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-09-23 12:49:25 Re: Re: [BUGS] BUG #6189: libpq: sslmode=require verifies server certificate if root.crt is present
Previous Message Robert Haas 2011-09-23 12:42:42 Re: [v9.2] make_greater_string() does not return a string in some cases