Re: How to monitor resources on Linux.

From: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To: "John R Allgood" <jallgood(at)the-allgoods(dot)net>
Cc: "Andrew Sullivan" <ajs(at)crankycanuck(dot)ca>, pgsql-admin(at)postgresql(dot)org
Subject: Re: How to monitor resources on Linux.
Date: 2007-08-28 21:06:26
Message-ID: dcc563d10708281406r18c8447fkf354e0d696e4374d@mail.gmail.com
Lists: pgsql-admin
On 8/28/07, John R Allgood <jallgood(at)the-allgoods(dot)net> wrote:
>
>  Here is the output from free on one of the nodes. I have seen free mem go
> as low as 15 and then go back up. Like I was saying earlier my concern was
> why the kernel started killing my postmasters. Here is the kernel message
> "kernel: oom-killer: gfp_mask=0xd0". This started happening when we were
> running our midday backup. After backup runs vacuum will start up right
> after. We rebooted the servers last night and today backup and vacuum ran
> fine. Below the free output I have added the logging out of
> /var/log/messages. Thanks this is a great list.
>
>               total       used       free     shared    buffers     cached
>  Mem:          8116       5969       2146          0        144       4318
>  Low:           821        510        310
>  High:         7294       5459       1835
>  -/+ buffers/cache:       1506       6609
>  Swap:         2000          0       1999

I'm assuming those numbers are in megabytes.  If so, they look pretty
reasonable: you've got about 2 Gig free and 4 Gig cached, with 144 Meg
of buffer memory.  Have you got the output of free from when things
are going wrong?
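
For what it's worth, the "-/+ buffers/cache" row can be rebuilt from the
Mem: row by hand.  Here is a minimal Python sketch of that arithmetic,
assuming the old six-column free layout shown above and megabyte units
(the function name summarize_free is just something I made up for
illustration):

```python
def summarize_free(mem_line):
    """Split the 'Mem:' row of old-style `free` output into app use vs. reclaimable.

    Assumes six numeric columns: total, used, free, shared, buffers, cached.
    """
    fields = mem_line.split()
    total, used, free, shared, buffers, cached = (int(x) for x in fields[1:7])
    return {
        "total": total,
        # Memory held by processes once buffers and page cache are excluded.
        "used_by_apps": used - buffers - cached,
        # Memory the kernel could hand out: truly free plus buffers and cache.
        "reclaimable": free + buffers + cached,
    }


row = "Mem:          8116       5969       2146          0        144       4318"
summary = summarize_free(row)
print(summary)
```

With the numbers quoted above this gives 1507 used by applications and
6608 reclaimable, which is within a megabyte of the 1506 / 6609 that
free itself printed (free rounds each column separately).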

>  Aug 27 12:24:42 gan-lxc-01 kernel: Swap cache: add 2104, delete 2017, find
> 829/1136, race 0+0
>  Aug 27 12:24:42 gan-lxc-01 kernel: 1229 bounce buffer pages
>  Aug 27 12:24:42 gan-lxc-01 kernel: Free swap:       2047424kB
>  Aug 27 12:24:42 gan-lxc-01 kernel: 2260992 pages of RAM
>  Aug 27 12:24:42 gan-lxc-01 kernel: 1867512 pages of HIGHMEM
>  Aug 27 12:24:42 gan-lxc-01 kernel: 183273 reserved pages
>  Aug 27 12:24:42 gan-lxc-01 kernel: 942026 pages shared
>  Aug 27 12:24:42 gan-lxc-01 kernel: 87 pages swap cached
>  Aug 27 12:24:42 gan-lxc-01 kernel: Out of Memory: Killed process 19383
> (postmaster).
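
As a sanity check, the page counts in that kernel dump can be converted
to megabytes and compared with the free output.  A rough sketch,
assuming 4 kB pages (the usual size on x86, but the log does not state
it, so treat PAGE_KB as an assumption):

```python
PAGE_KB = 4  # assumed page size in kB; not stated in the kernel log


def pages_to_mib(pages, page_kb=PAGE_KB):
    """Convert a page count from the kernel dump to MiB."""
    return pages * page_kb / 1024


ram_mib = pages_to_mib(2260992)               # "pages of RAM"
highmem_mib = pages_to_mib(1867512)           # "pages of HIGHMEM"
usable_mib = pages_to_mib(2260992 - 183273)   # RAM minus "reserved pages"
swap_free_mib = 2047424 / 1024                # "Free swap" is already in kB

print(ram_mib, highmem_mib, usable_mib, swap_free_mib)
```

Under that assumption, RAM minus reserved pages comes to about 8116 MB,
HIGHMEM to about 7294 MB, and free swap to about 1999 MB, which lines up
with the total, High, and Swap free figures in the free output.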

Is there any other context to go with this, like something from the
postgres logs at the same time, or maybe top output sorted by memory?

I'm wondering if you're just running a few queries that fire off
really big sorts, and that's what's getting you.
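
If catching top at the right moment is awkward, something like the
following could be run from cron during the backup window to log the
biggest processes.  It is only a sketch: it reads VmRSS from
/proc/<pid>/status on Linux, and the function names are hypothetical.
Keep in mind that resident size includes whatever shared memory a
backend has touched, so postgres processes will look larger than their
private memory really is.

```python
import os


def parse_status(status_text):
    """Return (name, rss_kb) from the text of /proc/<pid>/status.

    rss_kb is None when there is no VmRSS line (e.g. kernel threads).
    """
    name, rss_kb = None, None
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key == "Name":
            name = value.strip()
        elif key == "VmRSS":
            rss_kb = int(value.split()[0])  # value looks like "  1234 kB"
    return name, rss_kb


def top_by_rss(limit=10, proc_root="/proc"):
    """List up to `limit` processes as (rss_kb, pid, name), largest first."""
    rows = []
    if not os.path.isdir(proc_root):
        return rows  # not a Linux-style /proc; nothing to report
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as fh:
                name, rss_kb = parse_status(fh.read())
        except OSError:
            continue  # process exited or is not readable
        if rss_kb is not None:
            rows.append((rss_kb, int(entry), name))
    rows.sort(key=lambda r: r[0], reverse=True)
    return rows[:limit]


for rss_kb, pid, name in top_by_rss():
    print(f"{rss_kb:>10} kB  pid {pid:<7} {name}")
```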
