Re: Checkpoint cost, looks like it is WAL/CRC

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: josh(at)agliodbs(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Checkpoint cost, looks like it is WAL/CRC
Date: 2005-07-26 21:56:19
Message-ID: 1122414979.3670.96.camel@localhost.localdomain
Lists: pgsql-hackers

On Fri, 2005-07-22 at 19:11 -0400, Tom Lane wrote:
> Hmm. Eyeballing the NOTPM trace for cases 302912 and 302909, it sure
> looks like the post-checkpoint performance recovery is *slower* in
> the latter. And why is 302902 visibly slower overall than 302905?
> I thought for a bit that you had gotten "patch" vs "no patch" backwards,
> but the oprofile results linked to these pages look right: XLogInsert
> takes significantly more time in the "no patch" cases.
>
> There's something awfully weird going on here. I was prepared to see
> no statistically-significant differences, but not multiple cases that
> seem to be going the "wrong direction".

All of the tests have been performed with wal_buffers = 8, so there will
be massive contention for those buffers, leading to increased I/O...

All of the tests show that there is a CPU utilisation drop, and an I/O
wait increase immediately following checkpoints.

When we advance the insert pointer and a wal_buffer still needs writing,
we clean it by attempting to perform an I/O while holding WALInsertLock.
Very probably the WALWriteLock is currently held, so we wait on the
WALWriteLock and everybody else waits on us. Normally, it's fairly hard
for that to occur since we attempt to XLogWrite when wal_buffers are more
than half full, but we do this with a conditional acquire, so when we're
busy we just keep filling up wal_buffers. Normally, that's OK.
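
To make the conditional-acquire behaviour concrete, here is a minimal sketch in Python rather than the actual backend C code. The lock object and the function name try_background_write are hypothetical stand-ins, not PostgreSQL identifiers; only the "more than half full, then try the lock without waiting" logic is taken from the description above.

```python
import threading

# Hypothetical stand-in for WALWriteLock; not the real PostgreSQL lock.
wal_write_lock = threading.Lock()


def try_background_write(buffers_used, buffers_total):
    """Return True if we wrote buffers out, False if we skipped the write."""
    # Only bother once the buffers are more than half full.
    if buffers_used * 2 <= buffers_total:
        return False
    # Conditional acquire: if the lock is busy, do not wait for it.
    if not wal_write_lock.acquire(blocking=False):
        return False  # someone else is writing; callers keep filling buffers
    try:
        # ... the dirty buffers would be written out here ...
        return True
    finally:
        wal_write_lock.release()
```

The point of the sketch is the second early return: while the write lock is held elsewhere, nothing blocks, but nothing frees buffer space either.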

When we have a checkpoint, almost every xlog write has at least a whole
block appended to it. So we can easily fill up wal_buffers very quickly
while WALWriteLock is held. Once there is no space available, we then
effectively halt all transactions while we write out that buffer.
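
A toy model of that fill-up, as a rough sketch only. Every number here is an assumption chosen for illustration (8 kB pages, 400-byte records, 4 records and 2 pages written per tick); none of them come from the test runs, and the model ignores locking entirely.

```python
PAGE_BYTES = 8192  # assumed WAL page size


def ticks_until_full(wal_buffers, record_bytes, records_per_tick,
                     pages_written_per_tick, max_ticks=10_000):
    """Return the tick at which the buffers run out of space, or None."""
    capacity = wal_buffers * PAGE_BYTES
    drained = pages_written_per_tick * PAGE_BYTES
    used = 0
    for tick in range(1, max_ticks + 1):
        used += record_bytes * records_per_tick  # inserts arrive
        if used >= capacity:
            return tick  # no space left: inserters would have to wait
        used = max(0, used - drained)  # writer flushes some pages
    return None  # the writer kept up for the whole run


# Hypothetical workloads: a small record, and the same record with a
# whole block appended, as happens just after a checkpoint.
normal = ticks_until_full(8, 400, 4, 2)
after_checkpoint = ticks_until_full(8, 400 + PAGE_BYTES, 4, 2)
after_checkpoint_big = ticks_until_full(64, 400 + PAGE_BYTES, 4, 2)
```

With these made-up rates the small-record case never fills 8 buffers, the post-checkpoint case fills them within a few ticks, and raising wal_buffers to 64 only delays the stall unless the write rate also catches up.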

My conjecture is that the removal of the CPU bottleneck has merely moved
the problem by allowing users to fill wal buffers faster and go into a
wait state quicker than they did before. The beneficial effect of the
conditional acquire when wal_buffers are full never occurs, and
performance drops.

We should run tests with much higher wal_buffers numbers to nullify the
effect described above and reduce contention. That way we will move
towards the log disk speed being the limiting factor, patch or no patch.

So, I think Tom's improvement of CRC/hole compression will prove itself
when we have higher values of wal_buffers.

Best Regards, Simon Riggs
