Re: Question on pgbench output

From: David Kerr <dmk(at)mr-paradox(dot)net>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Question on pgbench output
Date: 2009-04-03 23:34:58
Message-ID: 20090403233458.GC54342@mr-paradox.net
Lists: pgsql-performance

On Fri, Apr 03, 2009 at 06:52:26PM -0400, Tom Lane wrote:
- Greg Smith <gsmith(at)gregsmith(dot)com> writes:
- > pgbench is extremely bad at simulating large numbers of clients. The
- > pgbench client operates as a single thread that handles both parsing the
- > input files, sending things to clients, and processing their responses.
- > It's very easy to end up in a situation where that bottlenecks at the
- > pgbench client long before getting to 400 concurrent connections.
-
- Yeah, good point.

Hmmm, ok, I didn't realize that the pgbench client wasn't threaded. I've got a
Plan B that doesn't use pgbouncer that I'll try.

- > That said, if you're in the hundreds of transactions per second range that
- > probably isn't biting you yet. I've seen it more once you get around
- > 5000+ things per second going on.
-
- However, I don't think anyone else has been pgbench'ing transactions
- where client-side libpq has to absorb (and then discard) a megabyte of
- data per xact. I wouldn't be surprised that that eats enough CPU to
- make it an issue. David, did you pay any attention to how busy the
- pgbench process was?
I can run it again and have a look, no problem.
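One hedged way to check that while the benchmark runs is to sample the pgbench client's own CPU and memory use (assumes procps-style `ps` and `pgrep` are available):

```shell
# Sample the pgbench client process's CPU/memory while a run is in progress.
# pgrep prints nothing (and fails) if no pgbench process exists.
pid="$(pgrep -n pgbench || true)"
if [ -n "$pid" ]; then
    ps -o pid,%cpu,%mem,args -p "$pid"
else
    echo "pgbench is not running"
fi
```

If `%cpu` for the pgbench client sits near 100% of one core, the client itself is the bottleneck rather than the server.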

- Another thing that strikes me as a bit questionable is that your stated
- requirements involve being able to pump 400MB/sec from the database
- server to your various client machines (presumably those 400 people
- aren't running their client apps directly on the DB server). What's the
- network fabric going to be, again? Gigabit Ethernet won't cut it...

Yes, sorry, I'm not trying to be confusing, but I didn't want to bog
everyone down with a ton of details.

400 concurrent users doesn't mean they're each pulling 1.5 MB every
second, just that any one of them could pull 1.5 MB in any given second.
Most likely there's a gap of 6 seconds (minimum) to 45 seconds (average)
between each individual user's pulls. My Plan B above emulates that;
I was using pgbouncer to try to emulate the "worst case" scenario.
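Back-of-the-envelope, those gap estimates change the aggregate bandwidth picture a lot. A quick sketch using the figures from this thread (400 users, 1.5 MB per pull, 6 s minimum / 45 s average gap):

```shell
# Rough aggregate-bandwidth estimate from the numbers discussed in the thread.
awk 'BEGIN {
    users = 400; mb = 1.5        # 400 users, 1.5 MB per pull
    gap_min = 6; gap_avg = 45    # seconds between one user'"'"'s pulls
    printf "burst: %.0f MB/s, steady avg: %.1f MB/s, steady max: %.0f MB/s\n",
           users * mb,            # everyone pulls in the same second
           users * mb / gap_avg,  # steady-state average
           users * mb / gap_min   # steady-state worst case
}'
```

So the 600 MB/s figure is only the everyone-at-once burst; the steady-state average is closer to 13 MB/s, which is well within gigabit Ethernet.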

- The point I was trying to make is that it's the disk subsystem, not
- the CPU, that is going to make or break you.

Makes sense, I definitely want to avoid unnecessary I/O.

On Fri, Apr 03, 2009 at 05:51:50PM -0400, Greg Smith wrote:
- Wrapping a SELECT in a BEGIN/END block is unnecessary, and it will
- significantly slow things down for two reasons: the transaction
- overhead and the time pgbench spends parsing/submitting those
- additional lines. Your script should be two lines long, the
- \setrandom one and the SELECT.
-

Oh perfect, I can try that too. Thanks.
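For reference, a minimal sketch of what that two-line custom script might look like (filename, table name, and key range are illustrative, using 8.3-era pgbench syntax):

```shell
# Write a hypothetical two-line pgbench custom script:
# one \setrandom line plus one SELECT, with no BEGIN/END wrapper.
cat > select_only.sql <<'EOF'
\setrandom aid 1 100000
SELECT abalance FROM accounts WHERE aid = :aid;
EOF
# Then point pgbench at it, e.g.:
#   pgbench -n -c 8 -t 1000 -f select_only.sql mydb
```

The `-n` flag skips vacuuming the built-in tables, which is what you want when running a custom script.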

- The thing that's really missing from your comments so far is the cold
- vs. hot cache issue: at the point when you're running pgbench, is a lot
- of the data already sitting in cache?

I'm testing with a cold cache because, given how the items are spread
out, only a few of those 400 users at a time are likely to access
similar items.

- Wait until Monday, I'm announcing some pgbench tools at PG East this
- weekend that will take care of all this as well as things like
- graphing. It pushes all the info pgbench returns, including the latency
- information, into a database and generates a big stack of derived reports.
- I'd rather see you help improve that than reinvent this particular wheel.

Ah, very cool. Wish I could go (but I'm on the west coast).

Thanks again guys.

Dave Kerr
