Re: Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule?

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: hlinnaka <hlinnaka(at)iki(dot)fi>
Cc: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, Simon Riggs <simon(at)2ndquadrant(dot)com>, digoal zhou <digoal(dot)zhou(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Let PostgreSQL's On Schedule checkpoint write buffer smooth spread cycle by tuning IsCheckpointOnSchedule?
Date: 2015-07-06 03:30:50
Message-ID: CAA4eK1LNhU9oteZZN=4c8ZhXa-hiJUMZB60NoGh8Psi_spaVtg@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jul 5, 2015 at 1:18 PM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:

> On 07/04/2015 07:34 PM, Fabien COELHO wrote:
>
>>> I have ran some tests with this patch and the detailed results of the
>>> runs are attached with this mail.
>>>
>>
>> I do not understand really the aggregated figures in the files attached.
>>
>
> Me neither. It looks like Amit measured the time spent in mdread and
> mdwrite, but I'm not sure what conclusions one can draw from that.

As Heikki has pointed out, it is stats data for mdread and mdwrite
between the checkpoints (in the data, you need to search for
"checkpoint start"/"checkpoint done"). Between checkpoint start and
checkpoint done, the data shows the amount of read/write done. (I am
just trying to reproduce what Digoal has reported, so I am using his
script, and I don't understand everything in it either, but I think we
can look at the counts between checkpoints to deduce whether the IO is
flattened after the patch.)

Digoal was seeing a spike at the beginning of the checkpoint (after
checkpoint start) in his configuration without this patch, and the
spike seems to be reduced with the patch. In my tests, I don't see the
spike immediately after checkpoint start (although there are some
spikes in between), even without the patch. That means either I am not
using the right configuration to measure the IO, or there is some other
difference between the way Digoal ran the test and the way I ran it.

I have done the setup (the hardware will not be the same, but at least
I can run the tests and collect the data in a format similar to
Digoal's), so if you have suggestions about which parameters we should
tweak, or which tests to run to gather results, I can do that and
present the results here for further discussion.
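
To make "flattened" concrete, here is a minimal sketch of how the counts between checkpoints could be summarized. It is not Digoal's script, and the line format is an assumption made for illustration: one "mdwrite count=<n>" line per sampling interval, bracketed by "checkpoint start" and "checkpoint done" lines. The function names are hypothetical as well. A peak-to-mean ratio near 1.0 would suggest evenly spread writes, and a large ratio would suggest a spike.

```python
import re
from statistics import mean

# Assumed (hypothetical) sample line format: "mdwrite count=<n>"
WRITE_RE = re.compile(r"mdwrite\s+count=(\d+)")


def write_counts_per_checkpoint(lines):
    """Return one list of per-interval write counts for each checkpoint.

    Only samples that fall between a "checkpoint start" line and the
    following "checkpoint done" line are kept.
    """
    checkpoints = []
    current = None  # None means we are outside a checkpoint
    for line in lines:
        if "checkpoint start" in line:
            current = []
        elif "checkpoint done" in line:
            if current is not None:
                checkpoints.append(current)
            current = None
        elif current is not None:
            match = WRITE_RE.search(line)
            if match:
                current.append(int(match.group(1)))
    return checkpoints


def peak_to_mean(counts):
    """Ratio of the busiest interval to the average interval."""
    if not counts:
        return 0.0
    average = mean(counts)
    return max(counts) / average if average > 0 else 0.0


if __name__ == "__main__":
    sample = [
        "checkpoint start",
        "mdwrite count=100",
        "mdwrite count=10",
        "mdwrite count=10",
        "checkpoint done",
    ]
    for counts in write_counts_per_checkpoint(sample):
        print(counts, peak_to_mean(counts))
```

Comparing this ratio for runs with and without the patch, on the same configuration, would be one way to check whether the spike after checkpoint start is actually reduced.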

>>> I thought the patch should show difference if I keep max_wal_size to
>>> somewhat lower or moderate value so that checkpoint should get triggered
>>> due to wal size, but I am not seeing any major difference in the writes
>>> spreading.
>>>
>>
>> I'm not sure I understand your point. I would say that at full speed
>> pgbench the disk is always busy writing as much as possible, either
>> checkpoint writes or wal writes, so the write load as such should not be
>> that different anyway?
>>
>> I understood that the point of the patch is to check whether there is a
>> tps dip or not when the checkpoint begins, but I'm not sure how this can
>> be infered from the many aggregated data you sent, and from my recent
>> tests the tps is very variable anyway on HDD.
>
>
Yes, we definitely want to see the effect on TPS at the beginning of the
checkpoint, but even measuring the IO during the checkpoint, the way
Digoal was capturing the data, can show the effect of this patch.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
