Re: How to monitor resources on Linux.

From: "Scott Marlowe" <scott(dot)marlowe(at)gmail(dot)com>
To: "John R Allgood" <jallgood(at)the-allgoods(dot)net>
Cc: "Andrew Sullivan" <ajs(at)crankycanuck(dot)ca>, pgsql-admin(at)postgresql(dot)org
Subject: Re: How to monitor resources on Linux.
Date: 2007-08-28 21:06:26
Message-ID: dcc563d10708281406r18c8447fkf354e0d696e4374d@mail.gmail.com
Lists: pgsql-admin

On 8/28/07, John R Allgood <jallgood(at)the-allgoods(dot)net> wrote:
>
> Here is the output from free on one of the nodes. I have seen free mem go
> as low as 15 and then go back up. Like I was saying earlier my concern was
> why the kernel started killing my postmasters. Here is the kernel message
> "kernel: oom-killer: gfp_mask=0xd0". This started happening when we were
> running our midday backup. After backup runs vacuum will start up right
> after. We rebooted the servers last night and today backup and vacuum ran
> fine. Below the free output I have added the logging out of
> /var/log/messages. Thanks this is a great list.
>
>              total       used       free     shared    buffers     cached
> Mem:          8116       5969       2146          0        144       4318
> Low:           821        510        310
> High:         7294       5459       1835
> -/+ buffers/cache:       1506       6609
> Swap:         2000          0       1999

I'm assuming those numbers are in megabytes. If so, then they're
pretty realistic. You've got 2 Gig free and 4 Gig cached, with 144 Meg
of buffer mem. Very reasonable. Have you got the output of free from
when things are going wrong?
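
As a rough sketch of how to read those numbers: memory held in buffers
and cache can normally be reclaimed by the kernel, so the figure that
matters is roughly free + buffers + cached. The snippet below parses
the free output quoted above and adds those columns up. The function
names parse_mem_row and available_mb are made up for illustration, and
the units are assumed to be megabytes.

```python
# Sketch only: parse `free`-style output and estimate how much memory
# the kernel could hand out (free + buffers + cached).
# Assumes the old procps layout shown in this thread, values in MB.

FREE_OUTPUT = """\
             total       used       free     shared    buffers     cached
Mem:          8116       5969       2146          0        144       4318
Low:           821        510        310
High:         7294       5459       1835
-/+ buffers/cache:       1506       6609
Swap:         2000          0       1999
"""


def parse_mem_row(text):
    """Return the Mem: row as a dict keyed by the header column names."""
    lines = text.splitlines()
    headers = lines[0].split()
    for line in lines[1:]:
        if line.startswith("Mem:"):
            values = [int(v) for v in line.split()[1:]]
            return dict(zip(headers, values))
    raise ValueError("no Mem: row found")


def available_mb(mem):
    """Free memory plus what is held in buffers and page cache."""
    return mem["free"] + mem["buffers"] + mem["cached"]


mem = parse_mem_row(FREE_OUTPUT)
print(available_mb(mem))  # 6608
```

That lands within 1 MB of the 6609 that free itself reports on the
"-/+ buffers/cache" line; the small gap is presumably rounding.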

> Aug 27 12:24:42 gan-lxc-01 kernel: Swap cache: add 2104, delete 2017, find
> 829/1136, race 0+0
> Aug 27 12:24:42 gan-lxc-01 kernel: 1229 bounce buffer pages
> Aug 27 12:24:42 gan-lxc-01 kernel: Free swap: 2047424kB
> Aug 27 12:24:42 gan-lxc-01 kernel: 2260992 pages of RAM
> Aug 27 12:24:42 gan-lxc-01 kernel: 1867512 pages of HIGHMEM
> Aug 27 12:24:42 gan-lxc-01 kernel: 183273 reserved pages
> Aug 27 12:24:42 gan-lxc-01 kernel: 942026 pages shared
> Aug 27 12:24:42 gan-lxc-01 kernel: 87 pages swap cached
> Aug 27 12:24:42 gan-lxc-01 kernel: Out of Memory: Killed process 19383
> (postmaster).

Is there any other context to go with this, like something from the
postgres logs at the same time, or maybe top output sorted by memory?
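
If capturing top at the right moment is awkward, one possible
alternative is to snapshot ps and sort it by resident memory. This is
only a sketch: it assumes a procps-style ps that accepts
"-eo pid,rss,comm", and the helper names snapshot and top_by_rss are
invented here. Keep in mind that RSS includes shared memory a backend
has touched, so postgres processes can look larger than they really
are.

```python
# Sketch only: list the biggest processes by resident set size (RSS).
# Assumes a procps-style `ps` that supports `-eo pid,rss,comm`,
# with RSS reported in kilobytes.
import subprocess


def snapshot():
    """Run ps once and return its raw text output."""
    result = subprocess.run(
        ["ps", "-eo", "pid,rss,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def top_by_rss(ps_output, n=10):
    """Return up to n (rss_kb, pid, command) tuples, largest RSS first."""
    rows = []
    for line in ps_output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # ignore blank or malformed lines
        pid, rss, comm = parts
        rows.append((int(rss), int(pid), comm.strip()))
    rows.sort(reverse=True)
    return rows[:n]


# Example use (run from cron or by hand while the backup is going):
#     for rss_kb, pid, comm in top_by_rss(snapshot()):
#         print(f"{pid:>7} {rss_kb / 1024:8.1f} MB  {comm}")
```

Logging this every minute or so during the midday backup and vacuum
would show which processes were growing just before the oom-killer
fired.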

I'm wondering if you're just running a few queries that fire really
big sorts off and that's what's getting you.
