Re: pg_basebackup, requested WAL has already been removed

From: Lonni J Friedman <netllama(at)gmail(dot)com>
To: Sergey Koposov <koposov(at)ast(dot)cam(dot)ac(dot)uk>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: pg_basebackup, requested WAL has already been removed
Date: 2013-05-10 17:02:54
Message-ID: CAP=oouENQQMXJ1MsDCFerJ-9qN9AQKCFOMMNWOjzaj9uqYKYaA@mail.gmail.com
Lists: pgsql-general

That's a good point. In that case I don't know; perhaps it is a bug, but
I'd be surprised if this weren't working, as it's not really a corner case
that could be missed in testing, as long as all the options were exercised.
Hopefully someone else can weigh in.
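
If it does turn out to be a matter of WAL retention, here is a rough way to
size wal_keep_segments along the lines of my earlier suggestion (quoted
below). This is only a sketch: it assumes the default 16 MB segment size, and
the function name, the safety factor, and the example numbers are made up for
illustration. The WAL rate in the example is a guess based on your log, where
checkpoints 8 seconds apart with checkpoint_segments = 32 suggest roughly 32
segments being written every 8 seconds during the index build.

```python
import math

# Assumption: default WAL segment size of 16 MB.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def estimate_wal_keep_segments(wal_bytes_per_sec, backup_seconds, safety_factor=1.5):
    """Estimate how many WAL segments are written while a backup runs.

    wal_bytes_per_sec: assumed WAL generation rate during the busy period
    backup_seconds:    assumed time the retention has to cover
    safety_factor:     hypothetical head-room multiplier (must be >= 1)
    """
    if wal_bytes_per_sec < 0 or backup_seconds < 0:
        raise ValueError("rate and duration must be non-negative")
    if safety_factor < 1:
        raise ValueError("safety_factor must be at least 1")
    wal_bytes = wal_bytes_per_sec * backup_seconds * safety_factor
    # Round up so that a partially filled segment still counts.
    return math.ceil(wal_bytes / WAL_SEGMENT_BYTES)


if __name__ == "__main__":
    # Guessed rate: 32 segments every 8 seconds (from the checkpoint warnings).
    rate = 32 * WAL_SEGMENT_BYTES / 8
    # Hypothetical: the burst of heavy WAL activity lasts 10 minutes.
    print(estimate_wal_keep_segments(rate, 600))
```

Each retained segment takes 16 MB of disk, so it is worth checking the free
space on the WAL volume before putting a large value into postgresql.conf.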

On Fri, May 10, 2013 at 10:00 AM, Sergey Koposov <koposov(at)ast(dot)cam(dot)ac(dot)uk> wrote:
>
> On Fri, 10 May 2013, Lonni J Friedman wrote:
>
>> It's definitely not a bug. You need to set/increase wal_keep_segments
>> to a value that ensures that WAL segments aren't recycled faster than
>> the time required to complete the base backup (plus some buffer).
>
>
> But I thought that wal_keep_segments is not needed in streaming mode
> ("--xlog-method=stream"), and the documentation
> http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html
> only mentions wal_keep_segments when talking about --xlog-method=fetch.
>
>
>>
>> On Fri, May 10, 2013 at 9:48 AM, Sergey Koposov <koposov(at)ast(dot)cam(dot)ac(dot)uk> wrote:
>>>
>>> Hi,
>>>
>>>
>>> I've recently started to use pg_basebackup --xlog-method=stream to back up
>>> my multi-TB database.
>>> Previously I ran the backup when there was not much activity in the DB and
>>> it went perfectly fine, but today I started the backup and it failed twice,
>>> at almost the same time as the CREATE INDEX (and, the other time, CLUSTER)
>>> commands finished.
>>>
>>> Here:
>>>
>>> postgres(at)cappc118:/mnt/backup/wsdb_130510$ pg_basebackup --xlog-method=stream --progress --verbose --pg
>>> transaction log start point: 23AE/BD003E70
>>> pg_basebackup: starting background WAL receiver
>>> pg_basebackup: unexpected termination of replication stream: FATAL: requested WAL segment 00000001000023B1000000FE has already been removed
>>> 4819820/16816887078 kB (4%), 0/1 tablespace (/mnt/backup/wsdb_130510/base/1)
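
For reference, the start point and the segment name in that output can be
related to each other. A minimal sketch, assuming the default 16 MB segment
size and the usual 24-hex-digit file name layout (8 digits each for timeline,
log id, and segment number); the function names are hypothetical:

```python
# Assumption: default WAL segment size of 16 MB.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def parse_wal_segment_name(name):
    """Split a 24-hex-digit WAL file name into (timeline, log_id, segment)."""
    if len(name) != 24:
        raise ValueError("expected a 24-character WAL segment name")
    timeline = int(name[0:8], 16)
    log_id = int(name[8:16], 16)
    segment = int(name[16:24], 16)
    return timeline, log_id, segment


def segment_name_for_lsn(lsn, timeline=1):
    """Return the WAL file name containing an LSN written as 'XXXX/YYYYYYYY'."""
    log_hex, offset_hex = lsn.split("/")
    log_id = int(log_hex, 16)
    # The segment number is the byte offset within the log divided by 16 MB.
    segment = int(offset_hex, 16) // WAL_SEGMENT_BYTES
    return "%08X%08X%08X" % (timeline, log_id, segment)


if __name__ == "__main__":
    # Values taken from the pg_basebackup output above.
    print(segment_name_for_lsn("23AE/BD003E70"))
    print(parse_wal_segment_name("00000001000023B1000000FE"))
```

With the values above, the backup started in segment
00000001000023AE000000BD, while the segment that could not be fetched belongs
to log id 23B1, three log ids later than the start point.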
>>>
>>> And the logs from around that time contained:
>>>
>>> some_user:wsdb:2013-05-10 14:35:41 BST:10587LOG: duration: 40128.163 ms statement: CREATE INDEX usno_clean_q3c_idx ON usno_clean (q3c_ang2ipix(ra,dec));
>>> ::2013-05-10 14:35:43 BST:25529LOG: checkpoints are occurring too frequently (8 seconds apart)
>>> ::2013-05-10 14:35:43 BST:25529HINT: Consider increasing the configuration parameter "checkpoint_segments".
>>> ::2013-05-10 14:35:51 BST:25529LOG: checkpoints are occurring too frequently (8 seconds apart)
>>> ::2013-05-10 14:35:51 BST:25529HINT: Consider increasing the configuration parameter "checkpoint_segments".
>>> postgres:[unknown]:2013-05-10 14:35:55 BST:8177FATAL: requested WAL segment 00000001000023B1000000FE has already been removed
>>> some_user:wsdb:2013-05-10 14:36:59 BST:10599LOG: duration: 78378.194 ms statement: CLUSTER usno_clean_q3c_idx ON usno_clean;
>>>
>>> On the previous occasion when it happened, CREATE INDEX was being
>>> executed:
>>>
>>> some_user:wsdb:2013-05-10 09:17:20 BST:3300LOG: duration: 67.680 ms statement: SELECT name FROM (SELECT pg_catalog.lower(name) AS name FROM pg_catalog.pg_settings UNION ALL SELECT 'session authorization' UNION ALL SELECT 'all') ss WHERE substring(name,1,4)='rand' LIMIT 1000
>>> ::2013-05-10 09:22:47 BST:25529LOG: checkpoints are occurring too frequently (18 seconds apart)
>>> ::2013-05-10 09:22:47 BST:25529HINT: Consider increasing the configuration parameter "checkpoint_segments".
>>> postgres:[unknown]:2013-05-10 09:22:49 BST:27659FATAL: requested WAL segment 000000010000239900000040 has already been removed
>>> some_user:wsdb:2013-05-10 09:22:57 BST:3236LOG: duration: 542955.262 ms statement: CREATE INDEX xmatch_temp_usnoid_idx ON xmatch_temp (usno_id);
>>>
>>> The configuration:
>>> PG 9.2.4, Debian 7.0, amd64
>>>
>>> shared_buffers = 10GB
>>> work_mem = 1GB
>>> maintenance_work_mem = 1GB
>>> effective_io_concurrency = 5
>>> synchronous_commit = off
>>> checkpoint_segments = 32
>>> max_wal_senders = 2
>>> effective_cache_size = 30GB
>>> autovacuum_max_workers = 3
>>> wal_level=archive
>>> archive_mode = off
>>>
>>> Does it look like a bug or am I missing something ?
>>>
>>> Thanks,
>>> Sergey
>>>
>>>
>>
>>
>>
>>
>> --
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> L. Friedman netllama(at)gmail(dot)com
>> LlamaLand https://netllama.linux-sxs.org
>>
>

--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
L. Friedman netllama(at)gmail(dot)com
LlamaLand https://netllama.linux-sxs.org
