Re: sustained update load of 1-2k/sec

From: Bob Ippolito <bob(at)redivi(dot)com>
To: Mark Cotner <mcotner(at)yahoo(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: sustained update load of 1-2k/sec
Date: 2005-08-19 09:09:27
Message-ID: 67788A4A-4011-49AB-B329-683FD9532661@redivi.com
Lists: pgsql-performance


On Aug 18, 2005, at 10:24 PM, Mark Cotner wrote:

> I'm currently working on an application that will poll
> thousands of cable modems per minute and I would like
> to use PostgreSQL to maintain state between polls of
> each device. This requires a very heavy amount of
> updates in place on a reasonably large table (100k-500k
> rows, ~7 columns mostly integers/bigint). Each row
> will be refreshed every 15 minutes, or at least that's
> how fast I can poll via SNMP. I hope I can tune the
> DB to keep up.
>
> The app is threaded and will likely have well over 100
> concurrent db connections. Temp tables for storage
> aren't a preferred option since this is designed to be
> a shared nothing approach and I will likely have
> several polling processes.

Somewhat OT, but..

The easiest way to speed that up is to use fewer threads. You're
adding a whole TON of overhead with that many threads that you just
don't want or need. You should probably be using something event-
driven to solve this problem, with just a few database threads to
store all that state. Less is definitely more in this case. See
<http://www.kegel.com/c10k.html> (and there's plenty of other
literature out there saying that event driven is an extremely good
way to do this sort of thing).

Here are some frameworks to look at for this kind of network code:
(Python) Twisted - <http://twistedmatrix.com/>
(Perl) POE - <http://poe.perl.org/>
(Java) java.nio (not familiar enough with the Java thing to know
whether or not there's a high-level wrapper)
(C++) ACE - <http://www.cs.wustl.edu/~schmidt/ACE.html>
(Ruby) IO::Reactor - <http://www.deveiate.org/code/IO-Reactor.html>
(C) libevent - <http://monkey.org/~provos/libevent/>

.. and of course, you have select/poll/kqueue/WaitNextEvent/whatever
that you could use directly, if you wanted to roll your own solution,
but don't do that.
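
To make the event-driven idea concrete, here is a minimal sketch using Python's standard asyncio module (a modern stand-in, not one of the frameworks listed above). The names poll_modem and poll_all, the simulated latency, and the max_in_flight limit are all assumptions for illustration; a real poller would issue a non-blocking SNMP request where the sleep is.

```python
import asyncio
import random

async def poll_modem(modem_id):
    # Hypothetical stand-in for a non-blocking SNMP request:
    # the sleep simulates network latency, the counter is fake data.
    await asyncio.sleep(random.uniform(0.01, 0.05))
    return modem_id, random.randint(0, 10**6)

async def poll_all(modem_ids, max_in_flight=200):
    # Cap the number of outstanding requests so a single thread
    # does not flood the network with polls all at once.
    limit = asyncio.Semaphore(max_in_flight)

    async def bounded(modem_id):
        async with limit:
            return await poll_modem(modem_id)

    # gather() returns results in the same order as its arguments.
    return await asyncio.gather(*(bounded(m) for m in modem_ids))

# One thread, one event loop, a thousand polls in flight over time.
results = asyncio.run(poll_all(range(1000)))
```

The point of the sketch is that concurrency comes from the event loop multiplexing many waiting sockets, not from one thread per device.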

If you don't want to optimize the whole application, I'd at least
just push the DB operations down to a very small number of
connections (*one* might even be optimal!), waiting on some kind of
thread-safe queue for updates from the rest of the system. This way
you can easily batch those updates into transactions and you won't be
putting so much unnecessary synchronization overhead into your
application and the database.

Generally, once you have more worker threads (or processes) than
CPUs, you're going to get diminishing returns in a bad way, assuming
those threads are making good use of their time.

-bob
