Re: [BUG] Checkpointer on hot standby runs without looking checkpoint_segments

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: masao(dot)fujii(at)gmail(dot)com
Cc: heikki(dot)linnakangas(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [BUG] Checkpointer on hot standby runs without looking checkpoint_segments
Date: 2012-05-14 02:36:59
Message-ID: 20120514.113659.259442748.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello, I've returned from my overseas trip for just over one week.

# and I'll leave Japan again after this...

> >     restorePtr <= replayPtr <= receivePtr
> >
> > But XLByteLT(recievePtr, replayPtr) this should not return true
> > under the condition above.. Something wrong in my assumption?
>
> When walreceiver is not running, i.e., the startup process reads the WAL files
> from the archival area, the replay location would get bigger than the
> receive one.

I've overlooked that startup process of the standby reads
archives first, and then WAL. But the current patch enables
progress governing based on checkpoint_segments during archive
recovery on the standby.

At Wed, 25 Apr 2012 02:20:37 +0900, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote in <CAHGQGwFSo5WFptNALxmE-ozRQq6Kk24XgTYbvJjHdYtJf9fdOg(at)mail(dot)gmail(dot)com>
> To alleviate the above problem, at least we might have to
> change the recovery logic so that the archived WAL files are
> restored with a temporary name, if cascading replication is not
> enabled (i.e., !standby_mode || !hot_standby || max_wal_senders
> <= 0). Or we might have to remove the restored WAL file after
> replaying it and before opening the next one, without waiting
> for a restartpoint to remove the restored WAL files. Thought?

I think it is beyond a bug fix. Furthermore, this is not a
problem of speed of restartpoint progression, I suppose. If so,
this should be cared as another problem.

Putting aside the problem of vast amount of copied WALs, I
suppose the remaining problem is needless restartpoint
acceleration caused during archive restoration on the standby.
I will try to resolve this problem. Is that OK?

Thinking about the so-many-copied-WAL problem, IMHO, using
temporary name only for non-cascading is not a good solution
because it leads complication and retrogression to the code and
behavior, nevertheless it won't solve the half of the problem. I
don't yet understand about cascading replication enough, but I
suppose erasing WALs as becoming out of use (by some logic I
don't find yet) is hopeful.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Qi Huang 2012-05-14 02:42:19 Re: Gsoc2012 idea, tablesample
Previous Message Robert Haas 2012-05-13 23:45:48 Re: Why do we still have commit_delay and commit_siblings?