Re: checkpoint and recovering process use too much memory

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: tao tony <tonytao0505(at)outlook(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: checkpoint and recovering process use too much memory
Date: 2017-11-03 02:21:28
Message-ID: 20171103022128.GC2267@telsasoft.com
Lists: pgsql-general

On Fri, Nov 03, 2017 at 01:43:32AM +0000, tao tony wrote:
> I had an asynchronous streaming replication HA cluster. Each node had 64G memory. pg is 9.6.2 and deployed on centos 6.
>
> Last month the database was killed by OS kernel for OOM, the checkpoint process was killed.

If you still have logs, was it killed during a large query? Perhaps one using
a hash aggregate?

> I noticed checkpoint process occupied memory for more than 20GB, and it was growing everyday. In the hot-standby node, the recovering process occupied memory as big as checkpoint process.

The "resident" RAM of a postgres subprocess is often just the fraction of
shared_buffers it has read or written. The checkpointer must necessarily read
all dirty pages from shared_buffers and write them out to disk (by way of the
page cache), so that's why its RSS is approaching your 32GB of shared_buffers.
And the recovery process is continuously writing into shared_buffers.

> Now In the standby node, checkpoint and recovering process used more than 50GB memory as below, and I worried someday the cluster would be killed by OS again.
>
>    PID USER     PR NI  VIRT RES SHR S %CPU %MEM    TIME+ COMMAND
> 167158 postgres 20  0 34.9g 25g 25g S  0.0 40.4 46:36.86 postgres: startup process recovering 00000004000008550000004B
> 167162 postgres 20  0 34.9g 25g 25g S  0.0 40.2 17:58.38 postgres: checkpointer process
>
> shared_buffers = 32GB
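
Note that in the top output above, RES and SHR are both 25g for each process, so
nearly all of the resident memory is shared pages. Both processes map the same
shared memory segment, which means adding the two RES values counts those pages
twice. A minimal sketch of that arithmetic follows, assuming the RES/SHR fields
use top's usual k/m/g suffixes; the helper names are mine for illustration and
are not part of PostgreSQL or procps.

```python
# Rough split of a process's resident memory into shared and private parts,
# using the RES and SHR columns as printed by top.
# Helper names are hypothetical, chosen only for this illustration.

UNITS = {"k": 1, "m": 1024, "g": 1024 ** 2, "t": 1024 ** 3}  # multipliers to KiB


def top_size_to_kib(field: str) -> float:
    """Convert a top memory field such as '25g', '512m' or '1048576' to KiB."""
    field = field.strip().lower()
    if field[-1] in UNITS:
        return float(field[:-1]) * UNITS[field[-1]]
    return float(field)  # no suffix: treated as plain KiB


def private_kib(res: str, shr: str) -> float:
    """Resident memory not accounted for by shared pages (RES minus SHR)."""
    return max(top_size_to_kib(res) - top_size_to_kib(shr), 0.0)


if __name__ == "__main__":
    # RES and SHR values copied from the two processes quoted above.
    for name, res, shr in [("startup", "25g", "25g"),
                           ("checkpointer", "25g", "25g")]:
        mib = private_kib(res, shr) / 1024
        print(f"{name}: private is roughly {mib:.0f} MiB")
```

top rounds these columns to whole gigabytes here, so this only shows that the
private part is small compared with 25GB, not its exact size.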

Also, what is work_mem set to?

Justin
