Re: limiting resources to users

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: "Gauthier, Dave" <dave(dot)gauthier(at)intel(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: limiting resources to users
Date: 2009-11-26 06:34:24
Message-ID: 4B0E2170.2050903@postnewspapers.com.au
Lists: pgsql-general

Gauthier, Dave wrote:
> Actually, limiting resources on a per DB basis would work for me too.
>
> I think Oracle had a thing where you could limit resources, or at least prioritize users.

Pg isn't Oracle - for which we can be alternately extremely thankful and
somewhat frustrated. Lots of shiny features vs hellish admin, freaky SQL
syntax, and $myeyesarebleeding.

> Someone else was reporting this to me, and again, it was a MySQL DB
> that I'll eventually be replacing with PG. I'll inherit this problem
> too, so I'm being proactive in looking for a fix. He said that the
> cpu was pegged (no mention of I/O). He started killing processes left
> and right until performance was back (it was either that or a DB
> restart to clear the slate).

Argh. It could just as easily be disk I/O.

for i in `seq 1 20`; do
  # skip= (not seek=) offsets each reader within /dev/md0, so the readers
  # hit different parts of the disk rather than the same cached blocks
  ( dd if=/dev/md0 of=/dev/null bs=1M skip=$(($i * 1000)) & )
done

results in:

top - 14:26:20 up 3 days, 1:43, 3 users, load average: 4.70, 3.02, 1.41
Tasks: 255 total, 15 running, 240 sleeping, 0 stopped, 0 zombie
Cpu(s): 30.1%us, 65.0%sy, 0.0%ni, 0.0%id, 2.5%wa, 0.5%hi, 2.0%si, 0.0%st
Mem: 4055728k total, 3776220k used, 279508k free, 750588k buffers
Swap: 2120544k total, 4572k used, 2115972k free, 2245336k cached

  PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM   TIME+  COMMAND
 1926 syslog   20   0 34552 1216  956 S   81  0.0 1:41.32  rsyslogd
32619 root     20   0  4152 1772  632 R    7  0.0 0:00.46  dd
32631 root     20   0  4152 1772  632 D    6  0.0 0:00.54  dd
32623 root     20   0  4152 1776  632 D    6  0.0 0:00.51  dd

... which could be confused for a busy CPU, but is really load caused by
disk I/O. Most of the `dd' processes are in 'D' state - i.e.
uninterruptible sleep in a system call - and if you check "wchan" with
"ps" you'll see that they're sleeping waiting for disk I/O.

Randomly killing processes is *not* a good way to tackle this - as I
think you already knew. Tools like `ps' (especially using custom formats
with wchan), `top', `vmstat', `iotop', `blktrace' + `blkparse',
`btrace', `sar', alt-sysrq-t, etc can help you learn more about
what is going on without having to employ such brute-force tactics.
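
A quick triage pass might look something like this (illustrative only -
option spellings and package availability vary by distro):

# any processes blocked on I/O? watch the 'b' and 'wa' columns
vmstat 1 5
# per-process I/O, showing only processes actually doing I/O
iotop -o
# CPU and per-device utilisation (needs the sysstat package)
sar -u -d 1 5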

You might find that the worst performance issues are caused by terribly
written queries, poorly structured tables or views, lack of basic DB
maintenance, bad DB tuning, etc. It need not be the users' eagerness
alone - and you may find that a properly set up database can take the
reporting load and the critical load effortlessly, or close enough as
not to bother with extreme measures like separating the DBs into
multiple clusters.
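
If bad queries are the suspect, Pg itself will tell you a lot before you
reach for OS-level tools. A sketch, assuming a database named "mydb"
(the column names below are the 8.x ones - procpid / current_query):

# what is running right now, oldest first
psql -d mydb -c "SELECT procpid, usename, query_start, current_query
                 FROM pg_stat_activity
                 WHERE current_query <> '<IDLE>'
                 ORDER BY query_start;"

# and in postgresql.conf, log anything slower than one second:
#   log_min_duration_statement = 1000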

--
Craig Ringer
