Re: Fix checkpoint skip logic on idle systems by tracking LSN progress

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
Subject: Re: Fix checkpoint skip logic on idle systems by tracking LSN progress
Date: 2016-11-19 03:00:24
Message-ID: CAA4eK1K0_ume-aJ2VDEfMkv-nUzgh+9BR4xQyC1NM=mU+SYTjQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Nov 14, 2016 at 9:33 AM, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
wrote:
> On Mon, Nov 14, 2016 at 12:49 PM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> At Sat, 12 Nov 2016 10:28:56 +0530, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
>> wrote in <CAA4eK1K0gGQTBxCyKqi6QnqOWGzEoVVPHCgPJ_RkOBoLPejCTA(at)mail(dot)gmail(dot)com>
>
>>> I think it is good to check the performance impact of this patch on
>>> many core m/c. Is it possible for you to once check with Alexander
>>> Korotkov to see if he can provide you access to his powerful m/c which
>>> has 70 cores (if I remember correctly)?
>
> I heard about a number like that, and there is no reason to not do
> tests to be sure.
>

Okay, I have done some performance tests with this patch and found that it
doesn't have any noticeable impact, which is good. Details of the
performance tests are below:
Machine configuration:
2 sockets, 28 cores (56 including Hyper-Threading)
RAM = 64GB
The data directory is on a magnetic disk and WAL is on an SSD.

Non-default postgresql.conf parameters
shared_buffers=8GB
max_connections=200
bgwriter_delay=10ms
checkpoint_completion_target=0

Keeping the above parameters fixed, I varied checkpoint_timeout across
tests. Each result below is the median of three 15-minute pgbench
TPC-B runs. All tests were performed at a client count of 64 and/or 128
(client count = number of concurrent sessions and threads, e.g. -c 8 -j
8). All tests used pgbench scale factor 300, which means the data fits
in shared buffers.

checkpoint_timeout=30s
client_count/patch_ver    64      128
HEAD                      5176    6853
Patch                     4963    6556

checkpoint_timeout=60s
client_count/patch_ver    64      128
HEAD                      4962    6894
Patch                     5228    6814

checkpoint_timeout=120s
client_count/patch_ver    64      128
HEAD                      5443    7308
Patch                     5453    6937

checkpoint_timeout=150s
client_count/patch_ver    128
HEAD                      7316
Patch                     7188

In the above results, you can see that in some cases (for example,
checkpoint_timeout=30s at 128 clients) TPS with the patch is slightly
lower (1~5%), but I consider that run-to-run variation, because on
repeating the tests I could not see such a regression. The reason for
keeping checkpoint_timeout and bgwriter_delay low is to test whether
there is any impact from the new locking introduced in the checkpointer
and bgwriter. The conclusion from my tests is that this patch is okay as
far as performance is concerned.
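
The relative differences between HEAD and the patch can be recomputed
from the figures reported above with a few lines of Python. This is an
illustrative sketch only: the TPS values are copied from the results in
this mail, while the dictionary layout and function names are assumptions
made for the example.

```python
# TPS values copied from the results above, keyed by checkpoint_timeout,
# then by build, then by client count.
RESULTS = {
    "30s": {"HEAD": {64: 5176, 128: 6853}, "Patch": {64: 4963, 128: 6556}},
    "60s": {"HEAD": {64: 4962, 128: 6894}, "Patch": {64: 5228, 128: 6814}},
    "120s": {"HEAD": {64: 5443, 128: 7308}, "Patch": {64: 5453, 128: 6937}},
    "150s": {"HEAD": {128: 7316}, "Patch": {128: 7188}},
}


def pct_change(head_tps, patch_tps):
    """Percentage change of the patched build relative to HEAD."""
    return (patch_tps - head_tps) / head_tps * 100.0


def summarize(results):
    """Return (checkpoint_timeout, clients, pct_change) for every data point."""
    rows = []
    for timeout, builds in results.items():
        for clients, head_tps in sorted(builds["HEAD"].items()):
            patch_tps = builds["Patch"][clients]
            rows.append((timeout, clients, pct_change(head_tps, patch_tps)))
    return rows


if __name__ == "__main__":
    for timeout, clients, pct in summarize(RESULTS):
        print(f"checkpoint_timeout={timeout} clients={clients}: {pct:+.2f}%")
```

A negative value means the patched build was slower than HEAD for that
data point, a positive value means it was faster.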

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
