Re: Strange replication problem - segment restored from archive but still requested from master

From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Piotr Gasidło <quaker(at)barbara(dot)eu(dot)org>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Strange replication problem - segment restored from archive but still requested from master
Date: 2015-05-25 15:35:20
Message-ID: CAECtzeU7SRhHSWFTrakdd6Nh8=r4kys5ugQJHu_2QNxLgb7+1g@mail.gmail.com
Lists: pgsql-general pgsql-hackers

2015-05-25 15:15 GMT+02:00 Piotr Gasidło <quaker(at)barbara(dot)eu(dot)org>:

> 2015-05-25 11:30 GMT+02:00 Guillaume Lelarge <guillaume(at)lelarge(dot)info>:
>
> >> I currently have wal_keep_segments set to 0.
> >> Setting this to higher value will help? As I understand: master won't
> >> delete segment and could stream it to slave on request - so it will
> >> help.
> >
> >
> > It definitely helps, but the issue could still happen.
> >
>
> What conditions must be met for issue to happen?
>

Very high WAL traffic can make the slave lag so far behind that the
master has already recycled the segments it needs, even with
wal_keep_segments set.
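
To put rough numbers on that, here is a back-of-the-envelope sketch. It
assumes the default 16 MB WAL segment size and treats wal_keep_segments
as the only thing keeping old segments on the master (in practice the
master may hold a few more, so read this as a lower bound). The function
names and the traffic figures are made up for the illustration.

```python
# Back-of-the-envelope model, not PostgreSQL code.
# Assumption: default WAL segment size of 16 MB.
WAL_SEGMENT_MB = 16


def retention_window_seconds(wal_keep_segments: int, wal_mb_per_second: float) -> float:
    """Approximate time the master keeps a finished segment before recycling it."""
    if wal_mb_per_second <= 0:
        raise ValueError("wal_mb_per_second must be positive")
    return wal_keep_segments * WAL_SEGMENT_MB / wal_mb_per_second


def slave_can_stream(lag_seconds: float, wal_keep_segments: int,
                     wal_mb_per_second: float) -> bool:
    """True if the segment the slave needs should still be on the master."""
    return lag_seconds < retention_window_seconds(wal_keep_segments, wal_mb_per_second)
```

With wal_keep_segments = 32 and 8 MB/s of WAL, the window is about 64
seconds: a slave lagging by 30 seconds can still stream, one lagging by
two minutes has to go back to the archive.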

> Both archive_command on master and restore_command on slave are set and working.
> Also wal_keep_segments is set.
>
> I see no point of failure, only a delay in the case of high WAL traffic
> on the master:
> - the slave starts by restoring WALs from the archive,
> - it then connects to the master and notices that, to use the master's
> latest WAL, it needs the previous one ("the issue"),
> - the slave asks the master for the previous WAL and gets it: job done,
> streaming replication is set up, exit,
> - if it cannot get it (WAL traffic is high, and between restoring the
> last WAL from the archive and asking the master for the next one, more
> than wal_keep_segments segments were recycled), it goes back to looking
> for WALs in the archive.
>
> Do I get it right?
>
>
Yes. If archive_command (on the master) and restore_command (on the
slave) are set correctly, there's no point of failure. You might still
get the "WAL not available" error message, but the slave can
synchronize itself with the archived WALs.
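
If it helps to see the sequence spelled out, here is a toy model of the
steps you listed. It is only a sketch: segments are plain integers, the
archive is a set, and the master's retained range is given by two
hypothetical bounds (master_oldest, master_current). None of this is
PostgreSQL code, and it leaves out the detail that the slave re-requests
the start of the last restored segment.

```python
# Toy model of the slave's catch-up sequence; all names are hypothetical.
from typing import List, Optional, Set, Tuple


def next_source(needed: int, archive: Set[int],
                master_oldest: int, master_current: int) -> Optional[str]:
    """Where the slave can get segment `needed`: archive first, then the master."""
    if needed in archive:
        return "archive"
    if master_oldest <= needed <= master_current:
        return "stream"
    return None  # recycled on the master and not archived yet


def catch_up(start: int, archive: Set[int],
             master_oldest: int, master_current: int) -> List[Tuple[int, str]]:
    """List the (segment, action) steps taken until streaming starts or the slave must wait."""
    steps: List[Tuple[int, str]] = []
    needed = start
    while needed <= master_current:
        source = next_source(needed, archive, master_oldest, master_current)
        if source is None:
            # Not a failure, only a delay: retry the archive later.
            steps.append((needed, "wait for archive"))
            break
        steps.append((needed, source))
        if source == "stream":
            break  # streaming replication is established
        needed += 1
    return steps
```

In this model, a slave starting at segment 1 with segments 1 to 3 in the
archive and the master still holding 4 to 6 restores three segments and
then streams. If the master has already recycled the segment the slave
needs and the archive does not have it yet, the slave simply waits for
the archive to catch up.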

--
Guillaume.
http://blog.guillaume.lelarge.info
http://www.dalibo.com
