strange pgbench results (as if blocked at the end)

From: "Tomas Vondra" <tv(at)fuzzy(dot)cz>
To: pgsql-performance(at)postgresql(dot)org
Subject: strange pgbench results (as if blocked at the end)
Date: 2011-08-12 23:37:19
Message-ID: 2f9ba8ed6e4ed11d9a6d236cb4d2a2ec.squirrel@sq.gransy.com
Lists: pgsql-performance

Hi,

I've run a lot of pgbench tests recently (trying to compare various fs,
block sizes etc.), and I've noticed several really strange results.

Each benchmark consists of three simple steps:

1) set up the database
2) read-only run (10 clients, 5 minutes)
3) read-write run (10 clients, 5 minutes)

with a short read-only warm-up (1 client, 1 minute) before each run.
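
For reference, the sequence above could be scripted roughly as follows.
This is only a sketch, not the script actually used for these runs: the
database name "bench" and the scale factor 100 are assumptions, and the
flags are the standard pgbench ones (-i initialize, -s scale, -S
select-only, -c clients, -T duration in seconds, -l per-transaction log).

```python
import subprocess

DB = "bench"  # hypothetical database name, not from the original post


def pgbench_cmd(clients, seconds, read_only, log=False):
    """Build a pgbench command line for one timed run."""
    cmd = ["pgbench", "-c", str(clients), "-T", str(seconds)]
    if read_only:
        cmd.append("-S")  # built-in select-only script
    if log:
        cmd.append("-l")  # write a per-transaction log
    cmd.append(DB)
    return cmd


def benchmark_plan(scale=100):
    """Return the commands for one benchmark, in execution order.

    The scale factor default is an assumption for illustration.
    """
    warmup = pgbench_cmd(1, 60, read_only=True)
    return [
        ["pgbench", "-i", "-s", str(scale), DB],          # 1) set up the database
        warmup,                                           # warm-up, 1 client, 1 minute
        pgbench_cmd(10, 300, read_only=True, log=True),   # 2) read-only run
        warmup,                                           # warm-up before the next run
        pgbench_cmd(10, 300, read_only=False, log=True),  # 3) read-write run
    ]


def run_benchmark(scale=100):
    """Execute the plan, stopping at the first failing command."""
    for cmd in benchmark_plan(scale):
        subprocess.run(cmd, check=True)
```

Calling benchmark_plan() only builds the command lists, so the sequence
can be inspected without a running server; run_benchmark() executes it.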

I've run nearly 200 of these, and in about 10 cases I got something that
looks like this:

http://www.fuzzy.cz/tmp/pgbench/tps.png
http://www.fuzzy.cz/tmp/pgbench/latency.png

i.e. it runs just fine for about 3:40 and then something goes wrong. The
benchmark should take 5 minutes, but it somehow locks up, does nothing for
about 2 minutes and then all the clients finish at the same time. So
instead of 5 minutes the run actually takes about 6:40.
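
A stall like this can be located in the per-transaction log by looking
for the longest pause between consecutive transaction completions. The
sketch below assumes the pgbench 9.0 log line format
"client_id transaction_no time file_no time_epoch time_us", where
time_epoch and time_us give the completion time; the function names are
made up for illustration.

```python
def completion_times(lines):
    """Yield completion timestamps (seconds, as float) from pgbench -l lines.

    Assumed line format (pgbench 9.0):
    client_id transaction_no time file_no time_epoch time_us
    """
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        yield int(fields[4]) + int(fields[5]) / 1e6


def longest_gap(lines):
    """Return (gap_seconds, gap_start) for the longest pause between
    consecutive transaction completions across all clients.

    gap_start is None if there are fewer than two transactions.
    """
    times = sorted(completion_times(lines))
    best = (0.0, None)
    for prev, cur in zip(times, times[1:]):
        if cur - prev > best[0]:
            best = (cur - prev, prev)
    return best
```

For a gzipped log, something like
longest_gap(gzip.open("pgbench.log.gz", "rt")) should work; a healthy run
would show a gap of well under a second, a blocked one a gap of minutes.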

The question is what went wrong - AFAIK there's nothing else running on
the machine that could cause this. I'm looking for possible culprits -
I'll try to repeat this run and see if it happens again.

The pgbench log is available here (notice the 10 lines at the end, those
are the 10 blocked clients) along with the postgres.log:

http://www.fuzzy.cz/tmp/pgbench/pgbench.log.gz
http://www.fuzzy.cz/tmp/pgbench/pg.log

Ignore the "immediate shutdown request" warning (once the benchmark is
over, I don't need the server anymore). Besides that there's just a bunch
of "pgstat wait timeout" warnings (which makes sense, because the pgbench
run does a lot of I/O).

I'd understand a slowdown, but why does it block?

I'm using PostgreSQL 9.0.4, the machine has 2GB of RAM and 1GB of shared
buffers. I admit the machine could be configured a bit differently (e.g.
with smaller shared buffers), but I've seen only about 10 such strange
results out of 200 runs, so I doubt this is the cause.

I was thinking about something like autovacuum, but I'd expect that to
happen much more frequently (same config, same workload, etc.). Also, it
happens only with some file systems.

For example for ext3/writeback, the STDDEV(latency) looks like this
(x-axis represents PostgreSQL block size, y-axis fs block size):

http://www.fuzzy.cz/tmp/pgbench/ext3-writeback.png

while for ext4/journal:

http://www.fuzzy.cz/tmp/pgbench/ext4-journal.png
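
A STDDEV(latency) value of this kind can be computed from a
per-transaction log along the following lines. This is a sketch under
assumptions, not necessarily how the charts were produced: it assumes the
pgbench 9.0 log format, where the third field is the transaction time in
microseconds, and it uses the sample standard deviation, which is what
PostgreSQL's stddev() aggregate computes.

```python
import statistics


def latency_stddev_ms(lines):
    """Sample standard deviation of transaction latency, in milliseconds.

    Assumed line format (pgbench 9.0):
    client_id transaction_no time file_no time_epoch time_us
    where "time" is the elapsed transaction time in microseconds.
    """
    latencies = []
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        latencies.append(int(fields[2]) / 1000.0)
    # statistics.stdev needs at least two data points
    return statistics.stdev(latencies)
```

One value per run (one combination of PostgreSQL block size and fs block
size) would then give one cell of a chart like the ones above.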

thanks
Tomas
