Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
Date: 2016-02-02 04:42:25
Message-ID: CAA4eK1LbbWPiwzP3NMEVijFb_PymQB6GyTDTgSyaDOpHZqm69g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Sat, Jan 30, 2016 at 7:38 PM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:
>
> On Fri, Jan 29, 2016 at 9:13 PM, Michael Paquier wrote:
> > Well, to put it short, I am just trying to find a way to make the
> > backend skip unnecessary checkpoints on an idle system, which results
> > in the following WAL pattern if system is completely idle:
> > CHECKPOINT_ONLINE
> > RUNNING_XACTS
> > RUNNING_XACTS
> > [etc..]
> >
> > The thing is that I am lost with the meaning of this condition to
> > decide if a checkpoint should be skipped or not:
> > if (prevPtr == ControlFile->checkPointCopy.redo &&
> > prevPtr / XLOG_SEG_SIZE == curInsert / XLOG_SEG_SIZE)
> > {
> > WALInsertLockRelease();
> > LWLockRelease(CheckpointLock);
> > return;
> > }
> > As at least one standby snapshot is logged before the checkpoint
> > record, the redo position is never going to match the previous insert
> > LSN, so checkpoints will never be skipped if wal_level >= hot_standby.
> > Skipping such unnecessary checkpoints is what you would like to
> > address, no? Because that's what I would like to do here first. And
> > once we got that right, we could think about addressing the case where
> > WAL segments are forcibly archived for idle systems.
>
> I have put a bit more of brain power into that, and finished with the
> patch attached. A new field called chkpProgressLSN is attached to
> XLogCtl, which is to the current insert position of the checkpoint
> when wal_level <= archive, or the LSN position of the standby snapshot
> taken after a checkpoint. The bgwriter code is modified as well so as
> it uses this progress LSN and compares it with the current insert LSN
> to determine if a standby snapshot should be logged or not. On an
> instance of Postgres completely idle, no useless checkpoints, and no
> useless standby snapshots are generated anymore.
>

@@ -8239,7 +8262,7 @@ CreateCheckPoint(int flags)
if ((flags & (CHECKPOINT_IS_SHUTDOWN |
CHECKPOINT_END_OF_RECOVERY |
CHECKPOINT_FORCE)) == 0)
{
- if
(prevPtr == ControlFile->checkPointCopy.redo &&
+ if (GetProgressRecPtr() == prevPtr &&

prevPtr / XLOG_SEG_SIZE == curInsert / XLOG_SEG_SIZE)
{

I think such a check won't consider all the WAL-activity happened
during checkpoint (after checkpoint start, till it finishes) which was
not the intention of this code without your patch.

I think both this and previous patch (hs-checkpoints-v1 ) approach
won't fix the issue in all kind of scenario's.

Let me try to explain what I think should fix this issue based on
discussion above, suggestions by Andres and some of my own
thoughts:

Have a new variable in WALInsertLock that would indicate the last
insertion position (last_insert_pos) we write to after holding that lock.
Ensure that we don't update last_insert_pos while logging standby
snapshot (simple way is to pass a flag in XLogInsert or distinguish
it based on type of record (XLOG_RUNNING_XACTS) or if you can
think of a more smarter way). Now both during checkpoint and
in bgwriter, to find out whether there is any activity since last
time, we need to check all the WALInsertLocks for latest insert position
(by referring last_insert_pos) and compare it with the latest position
we have noted during last such check and decide whether to proceed
or not based on comparison result. If you think it is not adequate to
store last_insert_pos in WALInsertLock, then we can think of having
it in PGPROC.

Yet another idea that occurs to me this morning is that why not
have a variable in shared memory in XLogCtlInsert or XLogCtl
similar to CurrBytePos/PrevBytePos which will be updated on
each XLOGInsert apart from the XLOGInsert for standby snapshots
and use that in a patch somewhat close to what you have in
hs-checkpoints-v1.

One related point is don't we need to bother about activity on
unlogged relations for checkpoints to occur, considering the
case when the only activity happened after last checkpoint
is on unlogged relations?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message petrum 2016-02-02 07:47:44 BUG #13905: Inconsistent code modification
Previous Message David G. Johnston 2016-02-01 15:13:11 Re: [BUGS] BUG #13904: Postgres “cannot assign non-composite value to a row variable” error assigning a composite value

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2016-02-02 04:43:03 Re: postgres_fdw join pushdown (was Re: Custom/Foreign-Join-APIs)
Previous Message Kouhei Kaigai 2016-02-02 04:39:54 Re: Way to check whether a particular block is on the shared_buffer?