Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
Date: 2016-02-10 07:31:34
Message-ID: CAB7nPqQKEK-dynCwenXi9MxwLYF6s7aHv8FGu52aWLZC=mjfPA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Wed, Feb 10, 2016 at 4:11 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
wrote:

> On Wed, Feb 10, 2016 at 12:16 PM, Michael Paquier <
> michael(dot)paquier(at)gmail(dot)com> wrote:
>
>>
>>
>> On Wed, Feb 10, 2016 at 12:41 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>> wrote:
>> > On Wed, Feb 10, 2016 at 7:17 AM, Michael Paquier <
>> michael(dot)paquier(at)gmail(dot)com>
>> > wrote:
>> >>
>> >
>> > Consider below code:
>> >
>> > /*
>> > + * Get progress before acquiring insert locks to shorten the locked
>> > + * section waiting ahead.
>> > + */
>> > + progress_lsn = GetProgressRecPtr();
>> > +
>> > + /*
>> > * We must block concurrent insertions while examining insert state to
>> > * determine the checkpoint REDO pointer.
>> > */
>> > WALInsertLockAcquireExclusive();
>> >
>> > In GetProgressRecPtr(), first it take WALInsertLocks one-by-one to
>> > retrieve the latest value of progressAt and then again it will take
>> > all the WALInsertLocks in WALInsertLockAcquireExclusive() and then
>> > do some work. Now I think it is okay to retrieve the latest of
>> progressAt
>> > after WALInsertLockAcquireExclusive() as you don't need to take the
>> > locks again.
>>
>> Sure, that would be OK. Now I am not on board if this means to have the
>> checkpoint take an exclusive locks for a bit longer duration if that's not
>> actually necessary.
>>
>>
> Taking and releasing locks again and again (which is done in patch)
> matters much more than adding few instructions under it and I think
> if you would have written the code in-a-way as in patch in some of
> the critical path, it would have been regressed badly, but because
> checkpoint doesn't happen often, reproducing regression is difficult.
> Anyway, there is no-point in too much argument, I think you have
> understood what I wanted to say and if you think the way code is
> currently written is better, then lets leave as it is for somebody else
> to give suggestion on it.
>

Okay.

>
>
>
>>
>> >> > - last_snapshot_lsn != GetXLogInsertRecPtr())
>> >> > +
>> >> > GetLastCheckpointRecPtr() < GetProgressRecPtr())
>> >> >
>> >> > How the above check in patch suppose to work?
>> >> > I think it could so happen once bgwriter logs snapshot, both
>> checkpoint
>> >> > and progressAt won't get updated any further and I think this check
>> will
>> >> > allow to log snapshots in such a case as well.
>> >>
>> >> The only purpose of this check is to do the following thing: if no WAL
>> >> activity has happened since last checkpoint, there is no need to log a
>> >> standby snapshot from the perspective of the bgwriter. In case the
>> >> system is idle, we want to skip logging that and avoid unnecessary
>> >> checkpoints because those records would have been generated. If the
>> >> system is not idle, or to put it in other words there has been at
>> >> least one record since the last checkpoint, we would log a standby
>> >> snapshot, enforcing as well a checkpoint to happen the next time
>> >> CreateCheckpoint() is gone through, and a standby snapshot is logged
>> >> as well after the checkpoint contents are flushed. I am not sure I
>> >> understand what you are getting at...
>> >
>> > Let me try to say again, suppose ControlFile->checkPoint is at
>> > 100 and latest value of progressAt returned by GetProgressRecPtr
>> > is 105, so the first time the above check happens, it will allow
>> > to log standby snapshot which is okay, now assume there is no
>> > activity, neither there is any checkpoint and again after
>> > LOG_SNAPSHOT_INTERVAL_MS interval, when the above check
>> > gets executed, it will pass and log the standby snapshot which is
>> > *not* okay, because we don't want to log standby snapshot when
>> > there is no activity. Am I missing something?
>>
>> Ah, right. Sorry for not getting your point. OK now I got it. So
>> basically what the code does not is that if there is just one record after
>> a checkpoint we would log every 15s a standby snapshot until the next
>> checkpoint even if nothing happens, but we can save a bunch of standby
>> records. So your point is that we need to save the result of
>> GetProgressRecPtr() at each loop in the bgwriter when a standby snapshot is
>> taken, say in a static XLogRecPtr called last_progress_lsn. And then at the
>> next loop we compare it on the current result of GetProgressRecPtr(), so
>> the condition to check if a standby snapshot should be generated or not in
>> the bgwriter becomes that:
>> (now >= timeout &&
>> GetLastCheckpointRecPtr() < current_progress_ptr &&
>> last_progress_ptr < current_progress_ptr)
>>
>
> Why do you think including checkpoint related check is
> required, how about something like below:
>
> (now >= timeout &&
> last_progress_ptr < current_progress_ptr)
>

We need as well to be sure that no standby snapshots are logged just after
a checkpoint in this code path when last_progress_ptr is referring to an
LSN position *before* the last checkpoint LSN. There is already one
snapshot in CreateCheckpoint() that is logged, there is no need of an extra
one, explaining why the check on progress since last checkpoint is
necessary to me.
--
Michael

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Amit Kapila 2016-02-10 08:00:32 Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
Previous Message Amit Kapila 2016-02-10 07:11:03 Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2016-02-10 07:39:15 Re: Freeze avoidance of very large table.
Previous Message Amit Kapila 2016-02-10 07:11:03 Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby