Re: Redesigning checkpoint_segments

From: Venkata Balaji N <nag1010(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter_e(at)gmx(dot)net>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Redesigning checkpoint_segments
Date: 2015-02-23 06:21:26
Message-ID: CAEyp7J8OnQf2RaRH7kYpGPtWp6HvgV2p6+n0E0tokCKpNBXVXw@mail.gmail.com
Lists: pgsql-hackers

On Sat, Feb 14, 2015 at 4:43 AM, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> wrote:

> On 02/04/2015 11:41 PM, Josh Berkus wrote:
>
>> On 02/04/2015 12:06 PM, Robert Haas wrote:
>>
>>> On Wed, Feb 4, 2015 at 1:05 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>>
>>>> Let me push "max_wal_size" and "min_wal_size" again as our new parameter
>>>> names, because:
>>>>
>>>> * does what it says on the tin
>>>> * new user friendly
>>>> * encourages people to express it in MB, not segments
>>>> * very different from the old name, so people will know it works
>>>> differently
>>>>
>>>
>>> That's not bad. If we added a hard WAL limit in a future release, how
>>> would that fit into this naming scheme?
>>>
>>
>> Well, first, nobody's at present proposing a patch to add a hard limit,
>> so I'm reluctant to choose non-obvious names to avoid conflict with a
>> feature nobody may ever write. There's a number of reasons a hard limit
>> would be difficult and/or undesirable.
>>
>> If we did add one, I'd suggest calling it "wal_size_limit" or something
>> similar. However, we're most likely to only implement the limit for
>> archives, which means that it might actually be called
>> "archive_buffer_limit" or something more to the point.
>>
>
> Ok, I don't hear any loud objections to min_wal_size and max_wal_size, so
> let's go with that then.
>
> Attached is a new version of this. It now comes in four patches. The first
> three are just GUC-related preliminary work, the first of which I posted on
> a separate thread today.
>

I applied all four patches to the latest master successfully and ran a test
with a heavy, continuous load. I do not see much difference in checkpoint
behaviour, and everything seems to be working as expected.

I ran the test with the following parameter values -

max_wal_size = 10000MB
min_wal_size = 1000MB
checkpoint_timeout = 5min
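
In case anyone wants to repeat the run, this is roughly how the settings can
be applied on the patched build (a sketch only; it assumes superuser access
and uses ALTER SYSTEM / pg_reload_conf() to change and reload them):

    -- WAL sizing GUCs introduced by these patches
    ALTER SYSTEM SET max_wal_size = '10000MB';
    ALTER SYSTEM SET min_wal_size = '1000MB';
    ALTER SYSTEM SET checkpoint_timeout = '5min';
    -- tell running backends to pick up the new values
    SELECT pg_reload_conf();
    -- confirm what the server actually sees
    SHOW max_wal_size;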

While the heavy load operation was running, checkpoints occurred based on the
timeout.

The pg_xlog size fluctuated a bit (not very much). For the first few minutes
it stayed at 3.3G, then gradually increased to a maximum of 5.5G during the
operation. The number of segments being removed+recycled fluctuated
continuously.
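
For reference, the same things can be watched from SQL with queries along
these lines (just a sketch; pg_ls_dir()/pg_stat_file() need superuser, and
the checkpoint counters come from the standard pg_stat_bgwriter view):

    -- total size of WAL segment files in pg_xlog
    SELECT pg_size_pretty(sum((pg_stat_file('pg_xlog/' || f)).size))
      FROM pg_ls_dir('pg_xlog') AS f
     WHERE f ~ '^[0-9A-F]{24}$';

    -- timed vs. requested checkpoints since the last stats reset
    SELECT checkpoints_timed, checkpoints_req, buffers_checkpoint
      FROM pg_stat_bgwriter;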

A portion of the checkpoint logs is shown below -

2015-02-23 15:16:00.318 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:16:53.943 GMT-10 LOG: checkpoint complete: wrote 3010
buffers (18.4%); 0 transaction log file(s) added, 0 removed, 159 recycled;
write=27.171 s, sync=25.945 s, total=53.625 s; sync files=20, longest=5.376
s, average=1.297 s; distance=2748844 kB, estimate=2748844 kB
2015-02-23 15:21:00.438 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:22:01.352 GMT-10 LOG: checkpoint complete: wrote 2812
buffers (17.2%); 0 transaction log file(s) added, 0 removed, 168 recycled;
write=25.351 s, sync=35.346 s, total=60.914 s; sync files=34, longest=9.025
s, average=1.039 s; distance=1983318 kB, estimate=2672291 kB
2015-02-23 15:26:00.314 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:26:25.612 GMT-10 LOG: checkpoint complete: wrote 2510
buffers (15.3%); 0 transaction log file(s) added, 0 removed, 121 recycled;
write=22.623 s, sync=2.477 s, total=25.297 s; sync files=20, longest=1.418
s, average=0.123 s; distance=2537230 kB, estimate=2658785 kB
2015-02-23 15:31:00.477 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:31:25.925 GMT-10 LOG: checkpoint complete: wrote 2625
buffers (16.0%); 0 transaction log file(s) added, 0 removed, 155 recycled;
write=23.657 s, sync=1.592 s, total=25.447 s; sync files=13, longest=0.319
s, average=0.122 s; distance=2797386 kB, estimate=2797386 kB
2015-02-23 15:36:00.607 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:36:52.686 GMT-10 LOG: checkpoint complete: wrote 3473
buffers (21.2%); 0 transaction log file(s) added, 0 removed, 171 recycled;
write=31.257 s, sync=20.446 s, total=52.078 s; sync files=33, longest=4.512
s, average=0.619 s; distance=2153903 kB, estimate=2733038 kB
2015-02-23 15:41:00.675 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:41:25.092 GMT-10 LOG: checkpoint complete: wrote 2456
buffers (15.0%); 0 transaction log file(s) added, 0 removed, 131 recycled;
write=21.974 s, sync=2.282 s, total=24.417 s; sync files=27, longest=1.275
s, average=0.084 s; distance=2258648 kB, estimate=2685599 kB
2015-02-23 15:46:00.671 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:46:26.757 GMT-10 LOG: checkpoint complete: wrote 2644
buffers (16.1%); 0 transaction log file(s) added, 0 removed, 138 recycled;
write=23.619 s, sync=2.181 s, total=26.086 s; sync files=12, longest=0.709
s, average=0.181 s; distance=2787124 kB, estimate=2787124 kB
2015-02-23 15:51:00.509 GMT-10 LOG: checkpoint starting: time
2015-02-23 15:53:30.793 GMT-10 LOG: checkpoint complete: wrote 13408
buffers (81.8%); 0 transaction log file(s) added, 0 removed, 170 recycled;
write=149.432 s, sync=0.664 s, total=150.284 s; sync files=13,
longest=0.286 s, average=0.051 s; distance=1244483 kB, estimate=2632860 kB

The above checkpoint logs were generated while the pg_xlog size was at 5.4G.
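
The log lines above require log_checkpoints, which is off by default. A
sketch of enabling it without a restart:

    ALTER SYSTEM SET log_checkpoints = on;
    SELECT pg_reload_conf();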

*Code Review*

I had a look at the code and do not have any comments from my end.

Regards,
Venkata Balaji N
