Vlad Romascanu wrote:
> Problem occurs when running (in production) Postgres 8.3.7 64-bit (from
> RPM) on Ubuntu 8.04.2, on an Amazon EC2 (xen) "Large" instance (8GB
> RAM), with the DB on a 50GB EC2 block device.
Hmm - don't know what the characteristics of running PG on EC2 are. This
might be something peculiar to that.
> Problem does not occur when running (in staging/pre-production) Postgres
> 8.3.5 32-bit (from RPM) on Ubuntu 8.04.1, on a less beefy Amazon EC2
> (xen) "Small" instance, with the DB on a 5GB EC2 block device.
> I am running with slow query logging on, and somewhat inexplicably I've
> been getting the following slow UPDATE query several times in the past
> weeks (I'm also including some context lines above and below):
>>> 2009-09-14 08:12:30.363 UTC user(at)database pid=26474
>>> ip=127.0.0.1(58364) sid=4aadf58d.676a:1 LOG: duration: 13472.892 ms
>>> statement: UPDATE "document_sets" SET "status" = E'rejected',
> This is one of the "faster" occurrences; at times the query has been
> logged as having taken 100+ seconds:
That's *very* slow indeed, and clearly the query itself is simple enough.
Typically in a situation like this you might suspect checkpointing was
the problem. Lots of dirty disk pages being flushed to disk before a
checkpoint. The stats for disk activity you give don't support that
idea, although 10 minute intervals is quite far apart.
Your table-stats show this is a small table. If it's updated a lot then
it might be that your autovacuum settings aren't high enough for this
table. The log_autovacuum_min_duration setting might be worth enabling
too - to see if autovacuum is taking a long time over anything.
Another thing that can cause irregular slowdowns is if you have a
trigger with some custom code that takes an unexpectedly long time to
run (takes locks, runs a query that plans badly occasionally). I don't
know if that's the case here.
Oh, if you don't have indexes on "status" or "updated_at" then you might
want to read up on HOT and decrease your fill-factor on the table too.
That's unrelated to this though.
It looks like the problem is common enough that you could have a small
script check pg_stat_activity once every 10 seconds and dump a snapshot
of pg_locks, vmstat etc. If you can catch the problem happening that
should make it easy to diagnose.
In response to
pgsql-performance by date
|Next:||From: Craig Ringer||Date: 2009-09-17 08:00:31|
|Subject: Re: noapic option|
|Previous:||From: Grzegorz Jaśkiewicz||Date: 2009-09-16 09:34:42|
|Subject: Re: CLUSTER and a problem|