Re: Queries sometimes take 1000 times the normal time

From: "Shridhar Daithankar" <shridhar_daithankar(at)persistent(dot)co(dot)in>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Queries sometimes take 1000 times the normal time
Date: 2003-08-29 08:30:34
Message-ID: 3F4F5C82.18176.44007A5@localhost
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 28 Aug 2003 at 20:16, Anders K. Pedersen wrote:

> Shridhar Daithankar wrote:
> > On 28 Aug 2003 at 1:07, Anders K. Pedersen wrote:
> >>We're running a set of Half-Life based game servers that lookup user
> >>privileges from a central PostgreSQL 7.3.4 database server (I recently
> >>ported the MySQL code in Adminmod to PostgreSQL to be able to do this).
> >>
> >>The data needed by the game servers are combined from several different
> >>tables, so we have some views set up to provide the data in the format
> >>needed.
> >>
> >>Currently there's only a few users in the database for testing purposes,
> >>and most of the time the user lookup's take 2-3 ms (I have syslog'ing of
> >>queries and duration turned on), but several times per hour the duration
> >>for one of the queries is 2-3 seconds (1000 times larger), while the
> >>surrounding lookups take the usual 2-3 ms.
> >
> >
> > Check vmstat during the same period if it is syncing at that point as Tom
> > suggested.
>
> I've been running a vmstat 1 logging process for a while now, and the
> sample below shows what happende around one of these spikes - at
> 18:18:03 specifically (actually there were two 1 second long queries,
> that finished at 18:18:03).
>
> Thu Aug 28 18:17:53 2003 0 0 0 40904 4568 22288 404352 0 0
> 12 0 181 362 2 1 97
> Thu Aug 28 18:17:54 2003 0 0 0 40904 4580 22260 404380 0 0
> 128 0 205 330 2 3 95
> Thu Aug 28 18:17:55 2003 0 0 2 40904 4576 22264 404380 0 0
> 0 284 224 127 0 1 99
> Thu Aug 28 18:17:56 2003 0 0 2 40904 5008 22268 404512 0 0
> 128 728 571 492 2 3 95
> Thu Aug 28 18:17:57 2003 0 0 1 40904 5000 22276 404512 0 0
> 0 120 201 181 1 0 99
> Thu Aug 28 18:17:58 2003 0 0 1 40904 4936 22284 404528 0 0
> 8 0 1147 2204 12 3 85
> Thu Aug 28 18:17:59 2003 0 0 0 40904 4784 22304 404660 0 0
> 148 0 2112 3420 2 3 95
> Thu Aug 28 18:18:00 2003 1 1 3 40904 4760 22324 404664 0 0
> 20 456 2374 3277 2 1 97
> Thu Aug 28 18:18:01 2003 0 2 10 40904 4436 22000 401456 0 0
> 144 540 510 457 11 6 83
> Thu Aug 28 18:18:02 2003 1 1 2 40904 8336 22032 401512 0 0
> 68 676 1830 2540 4 3 93
> Thu Aug 28 18:18:04 2003 1 0 1 40904 8160 22052 401664 0 0
> 140 220 2308 3253 2 3 95
> Thu Aug 28 18:18:05 2003 0 0 1 40904 7748 22064 402064 0 0
> 288 0 1941 2856 1 3 96
> Thu Aug 28 18:18:06 2003 0 0 3 40904 6704 22064 403100 0 0
> 496 992 2326 3510 0 5 95
> Thu Aug 28 18:18:07 2003 1 0 0 40904 6324 22088 402716 0 0
> 260 188 1984 2927 11 4 85
> Thu Aug 28 18:18:08 2003 0 0 0 40904 6920 22088 402828 0 0
> 72 0 419 1473 17 5 78
> Thu Aug 28 18:18:09 2003 0 0 0 40904 6784 22088 402964 0 0
> 128 0 235 476 2 1 97
> Thu Aug 28 18:18:10 2003 0 0 1 40904 6404 22088 402980 0 0
> 0 0 343 855 14 2 84
>

Notice a pattern. In first few entries free memory is increasing coupled with
IO. Few entries down the line it's decreasing again with IO. I would guess
that something terminated and started..

However given how idle CPU is, I wonder would it matter. Besides changes in
stats are pretty small to be any significant. I wouldn't worry about context
switches or interrupts. They don't seem to be any dramatic..

I wonder what kernel you are using. While running pgbench on 2.4 and 2.6 couple
of days back, I noticed several stalls with 2.4 where neither CPU or disk does
anything but nothing moves forward, for 30 sec. or so.

If possible try with 2.6 Check which scheduler you are using and which works
best for you.

http://marc.theaimsgroup.com/?l=linux-kernel&m=105743728122143&w=2

If you want a step by step how to install 2.6, I can give that too.. It's
pretty simple..

> > Are you using pooled connections? If yes you could shorten life of a connection
> > and force making a new connection every 10-15 minutes say. That would avoid IO
> > avelanche at the end of the hour types.
>
> I'm not quite sure, what you mean by "pooled connections". Each game
> server has one connection to the PostgreSQL server, which is opened,
> when the server is first started, and then never closed (until the game
> server terminates, but there's days between this happens).

I would say let each server start a new connection every 15 minutes. As soon as
new connection is established, close old one.

See if that cures the problem. Apache uses similar methods albeit it measure by
nubmer of requests served by each child.

HTH

Bye
Shridhar

--
It is necessary to have purpose. -- Alice #1, "I, Mudd", stardate 4513.3

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Alexander Priem 2003-08-29 08:41:04 Re: Indexing question
Previous Message Shridhar Daithankar 2003-08-29 08:18:24 Re: Hardware recommendations to scale to silly load