Re: Scalability in postgres

From: Scott Carey <scott(at)richrelevance(dot)com>
To: Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, James Mansion <james(at)mansionfamily(dot)plus(dot)com>
Cc: Flavio Henrique Araque Gurgel <flavio(at)4linux(dot)com(dot)br>, Fabrix <fabrixio1(at)gmail(dot)com>, Greg Smith <gsmith(at)gregsmith(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Scalability in postgres
Date: 2009-06-05 00:54:50
Message-ID: C64DBAEA.7515%scott@richrelevance.com
Lists: pgsql-performance


On 6/4/09 3:08 PM, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov> wrote:

> James Mansion <james(at)mansionfamily(dot)plus(dot)com> wrote:

>> I'm sorry, but (in particular) UNIX systems have routinely
>> managed large numbers of runnable processes where the run queue
>> lengths are long without such an issue.
>
> Well, the OP is looking at tens of thousands of connections. If we
> have a process per connection, how many tens of thousands can we
> handle before we get into problems with exhausting possible pid
> numbers (if nothing else)?

Well, the connections are idle much of the time. The OS doesn't really care
about these threads until they are ready to run, and even if they were all
runnable, there is little overhead in scheduling.

A context switch storm only happens when many threads are woken up and each
must yield again soon after getting onto a CPU. If you wake up 10,000
threads and each can get significant work done before yielding, no matter
what order they run in, the system will scale extremely well.

How the lock data structures are built to avoid cache-line collisions and
minimize cache line updates can also make or break a concurrency scheme and
is a bit hardware dependent.

> I know that if you do use a large number of threads, you have to be
> pretty adaptive. In our Java app that pulls data from 72 sources and
> replicates it to eight, plus feeding it to filters which determine
> what publishers for interfaces might be interested, the Sun JVM does
> very poorly, but the IBM JVM handles it nicely. It seems they use
> very different techniques for the monitors on objects which
> synchronize the activity of the threads, and the IBM technique does
> well when no one monitor is dealing with a very large number of
> blocking threads. They got complaints from people who had thousands
> of threads blocking on one monitor, so they now keep a count and
> switch techniques for an individual monitor if the count gets too
> high.
>

A generic locking solution must be adaptive, yes. But specific solutions
tailored to specific use cases rarely need to be adaptive. I would think
that the 4 or 5 most important locks or concurrency coordination points in
Postgres have very specific, unique properties.
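
The "keep a count and switch techniques" idea quoted above can be sketched in a few lines. This is only an illustration, assuming a simple policy (spin briefly while few threads are waiting, block outright once the waiter count passes a threshold). The class name AdaptiveLock, the threshold, and the spin count are all hypothetical; this is not how the IBM JVM or Postgres actually implements its locks.

```python
import threading


class AdaptiveLock:
    """Spin briefly when few threads wait; block outright when many do."""

    def __init__(self, waiter_threshold=4, spin_attempts=100):
        self._lock = threading.Lock()
        self._count_lock = threading.Lock()   # protects _waiters
        self._waiters = 0                     # threads currently trying to acquire
        self._waiter_threshold = waiter_threshold  # hypothetical tuning knob
        self._spin_attempts = spin_attempts        # hypothetical tuning knob

    def acquire(self):
        with self._count_lock:
            self._waiters += 1
            crowded = self._waiters > self._waiter_threshold
        try:
            if not crowded:
                # Few waiters: try a bounded number of non-blocking attempts.
                for _ in range(self._spin_attempts):
                    if self._lock.acquire(blocking=False):
                        return
            # Many waiters (or spinning failed): sleep until the lock is free.
            self._lock.acquire()
        finally:
            with self._count_lock:
                self._waiters -= 1

    def release(self):
        self._lock.release()

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, *exc):
        self.release()


def demo(num_threads=8, increments=500):
    """Increment a shared counter from several threads under the lock."""
    lock = AdaptiveLock()
    total = 0

    def worker():
        nonlocal total
        for _ in range(increments):
            with lock:
                total += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

A lock tailored to one known access pattern could drop the counting entirely and hard-code whichever of the two strategies fits, which is the point made above about specific solutions.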

> Perhaps something like that (or some other new approach) might
> mitigate the effects of tens of thousands of processes competing
> for a few resources, but it fundamentally seems unwise to turn those
> loose to compete if requests can be queued in some way.
>
> -Kevin
>
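
Queuing requests instead of letting every connection compete at once might look roughly like the sketch below. It assumes a fixed-size worker pool in front of the real work; MAX_ACTIVE, NUM_REQUESTS, and handle_request are hypothetical names for illustration, and the "work" is a placeholder rather than a real query.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_ACTIVE = 8       # hypothetical cap on requests running at the same time
NUM_REQUESTS = 200   # hypothetical number of client requests

_active = 0          # requests running right now
_peak = 0            # highest value _active has reached
_stats_lock = threading.Lock()


def handle_request(n):
    """Stand-in for a query; records how many requests run at once."""
    global _active, _peak
    with _stats_lock:
        _active += 1
        _peak = max(_peak, _active)
    result = n * n   # placeholder for real work
    with _stats_lock:
        _active -= 1
    return result


def run_queued(num_requests=NUM_REQUESTS, max_active=MAX_ACTIVE):
    # The executor keeps pending requests in an internal queue and runs
    # at most max_active of them concurrently.
    with ThreadPoolExecutor(max_workers=max_active) as pool:
        results = list(pool.map(handle_request, range(num_requests)))
    return results, _peak
```

However many requests arrive, the observed peak never exceeds the pool size; the rest simply wait in the queue instead of contending for the same few resources.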

There's a bunch of useful blog posts about locks, concurrency, etc., and how
they relate to low-level hardware here:
http://blogs.sun.com/dave/

In particular, these are interesting references (not only for Java):

http://blogs.sun.com/dave/entry/seqlocks_in_java
http://blogs.sun.com/dave/entry/biased_locking_in_hotspot
http://blogs.sun.com/dave/entry/java_thread_priority_revisted_in
http://blogs.sun.com/dave/entry/hardware_assisted_transactional_read_set
