Re: How can you get "WAL segment has already been removed" when doing synchronous replication ?!

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Hubert Lubaczewski <depesz(at)depesz(dot)com>
Cc: Raghavendra <raghavendra(dot)rao(at)enterprisedb(dot)com>, PostgreSQL - General <pgsql-general(at)postgresql(dot)org>
Subject: Re: How can you get "WAL segment has already been removed" when doing synchronous replication ?!
Date: 2013-07-12 16:33:54
Message-ID: CAMkU=1w-DG=2ehmsHYSNkNQqkKpj1Mw2s1dGx=Xe67GRzNtsfw@mail.gmail.com
Lists: pgsql-general

On Thu, Jul 11, 2013 at 11:01 AM, hubert depesz lubaczewski
<depesz(at)depesz(dot)com> wrote:
> On Thu, Jul 11, 2013 at 11:29:24PM +0530, Raghavendra wrote:
>> On Thu, Jul 11, 2013 at 11:18 PM, hubert depesz lubaczewski <
>> depesz(at)depesz(dot)com> wrote:
>>
>> >
>> > Yet, every now and then we're getting:
>> > FATAL: requested WAL segment * has already been removed
>> >
>> > Assuming no part of the system is issuing "set synchronous_commit
>> > = off", how can we get in such situation?
>> >
>> > Best regards,
>> >
>> > depesz
>> >
>> >
>> Increasing the wal_keep_segments ?
>
> I know that I can increase wal_keep_segments to "solve" it, but
> shouldn't it be *impossible* to happen with synchronous replication?

If a single transaction spans both log-switch boundaries and
checkpoint boundaries (at least two of the latter, I think), it is
possible for a file to be recycled before the commit, and hence before
any attempt to sync to the standby has occurred.
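
To make that sequence concrete, here is a rough toy model in Python. It is
only a sketch, not PostgreSQL's code: the class, the function, the segment
size of 16 units, and the retention rule (keep everything from the previous
checkpoint's redo point, or wal_keep_segments behind the current position,
whichever is older) are simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, field

SEGMENT_SIZE = 16  # toy WAL units per segment (hypothetical; not the real size)


@dataclass
class ToyPrimary:
    wal_keep_segments: int = 0
    insert_lsn: int = 0          # current end of WAL
    oldest_segment: int = 0      # oldest segment still on disk
    redo_points: list = field(default_factory=list)

    def write_wal(self, amount: int) -> None:
        """An open transaction (or anything else) generates WAL."""
        self.insert_lsn += amount

    def checkpoint(self) -> None:
        """Record a redo point, then recycle segments no longer needed locally."""
        self.redo_points.append(self.insert_lsn)
        if len(self.redo_points) < 2:
            return  # toy rule: nothing is recycled until a previous checkpoint exists
        prev_redo_seg = self.redo_points[-2] // SEGMENT_SIZE
        keep_seg = self.insert_lsn // SEGMENT_SIZE - self.wal_keep_segments
        keep_from = max(0, min(prev_redo_seg, keep_seg))
        self.oldest_segment = max(self.oldest_segment, keep_from)

    def has_segment_for(self, lsn: int) -> bool:
        """Can a standby still stream starting from this LSN?"""
        return lsn // SEGMENT_SIZE >= self.oldest_segment


def long_transaction_scenario(wal_keep_segments: int = 0) -> bool:
    """Return True if a standby stuck at LSN 0 can still be served."""
    primary = ToyPrimary(wal_keep_segments=wal_keep_segments)
    standby_flush_lsn = 0   # standby is behind; no commit has waited on it yet
    primary.write_wal(40)   # the open transaction crosses segment boundaries
    primary.checkpoint()    # first checkpoint
    primary.write_wal(40)   # still the same uncommitted transaction
    primary.checkpoint()    # second checkpoint recycles the oldest segments
    return primary.has_segment_for(standby_flush_lsn)
```

In this model, long_transaction_scenario(0) returns False (the segment the
standby needs is gone before any commit had a chance to wait), while a large
enough wal_keep_segments, such as 5, keeps it around.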

> After all - all commits should wait for slave to be 100% up to date!

But if the file isn't there on the sending end, no amount of waiting can help.

It looks like what is needed is to invoke the SyncRepWaitForLSN code
just before a log file is recycled, as well as upon transaction commit.
I'm not sure why that isn't already done indirectly. Doesn't the
checkpointer insert a WAL record upon completion of a checkpoint
indicating that completion, before any recycling is attempted? Surely
the LSN of that record is higher than that in any file becoming
eligible for recycling. But I guess that that record is not a commit
record, so does not trigger the sync rep.
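
As a sketch of what "wait before recycling" could mean, the check below
decides whether a segment is safe to recycle based on how far the synchronous
standby has flushed. This only illustrates the idea floated above; it is not
how PostgreSQL behaves, and the function name, its parameters, and the toy
segment size are all hypothetical.

```python
SEGMENT_SIZE = 16  # toy WAL units per segment (hypothetical)


def safe_to_recycle(segment_no: int, sync_standby_flush_lsn: int) -> bool:
    """True once the standby has flushed everything in the given segment.

    A checkpointer following this idea would hold on to the segment
    (or wait) until this returns True.
    """
    segment_end_lsn = (segment_no + 1) * SEGMENT_SIZE
    return sync_standby_flush_lsn >= segment_end_lsn
```

With this rule, segment 0 is kept while the standby sits at LSN 0 and becomes
recyclable only once the standby has flushed up to LSN 16 or beyond.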

Cheers,

Jeff
