Re: Anyone working on better transaction locking?

From: Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Anyone working on better transaction locking?
Date: 2003-04-12 06:51:12
Message-ID: 200304121221.12377.shridhar_daithankar@nospam.persistent.co.in
Lists: pgsql-hackers

On Saturday 12 April 2003 03:02, you wrote:
> Ron Peacetree wrote:
> > Zeus had a performance ceiling roughly 3x that of Apache when Zeus
> > supported threading as well as pre-forking and Apache only supported
> > pre forking. The Apache folks now support both. DB2, Oracle, and SQL
> > Server all use threads. Etc, etc.
>
> You can't use Apache as an example of why you should thread a database
> engine, except for the cases where the database is used much like the
> web server is: for numerous short transactions.

OK, let me share my experience. These are benchmarks on an intranet (100MBps
LAN), run against a 1GHz P-III/IV web server on Mandrake 9, serving a single
8K file.

apache2044: 1300 rps
boa: 4500 rps
Zeus: 6500 rps

Apache does too many things to be a speed demon, and what it offers is pretty
impressive from a performance point of view.

But a database is not a web server. It is not supposed to handle tons of
concurrent requests. That is a fundamental difference.

>
> > That's an awful lot of very bright programmers and some serious $$
> > voting that threads are worth it. Given all that, if PostgreSQL
> > specific thread support is =not= showing itself to be a win that's
> > an unexpected enough outcome that we should be asking hard questions
> > as to why not.
>
> It's not that there won't be any performance benefits to be had from
> threading (there surely will, on some platforms), but gaining those
> benefits comes at a very high development and maintenance cost. You
> lose a *lot* of robustness when all of your threads share the same
> memory space, and make yourself vulnerable to classes of failures that
> simply don't happen when you don't have shared memory space.

Well, threading does not necessarily imply a one-thread-per-connection model.
Threads can be used to keep the CPU busy during I/O and to take advantage of
SMP for things like sorts. This is especially true for 2.4.x Linux kernels,
where async I/O cannot be used in threaded apps because threads and signals
do not mix well.
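
A minimal sketch of the "CPU works during I/O" idea, in Python rather than
anything PostgreSQL actually does. The names blocking_read and
sort_while_reading are made up for illustration, and time.sleep stands in for
a blocking disk read:

```python
import threading
import time


def blocking_read(result, delay=0.2):
    # Hypothetical stand-in for a blocking disk read. time.sleep releases
    # the interpreter lock, much as a real blocking read() call would.
    time.sleep(delay)
    result["page"] = b"x" * 8192


def sort_while_reading(values, delay=0.2):
    # Start the "I/O" in a helper thread, then do CPU work (a sort) in the
    # calling thread while the read is still pending.
    result = {}
    reader = threading.Thread(target=blocking_read, args=(result, delay))
    start = time.monotonic()
    reader.start()
    ordered = sorted(values)
    reader.join()  # wait for the read to complete before using its result
    elapsed = time.monotonic() - start
    return ordered, result["page"], elapsed
```

Here the connection is still handled by one unit of work; the extra thread
only hides I/O latency behind the sort, which is the kind of use meant above.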

One connection per thread is not a good model for PostgreSQL, since it has
already built a robust product around the process paradigm. If I had to start
a new database project today, a mix of processes and threads is what I would
choose, but PostgreSQL is not at the same stage of life.

> > At their core, threads are a context switching efficiency tweak.
>
> This is the heart of the matter. Context switching is an operating
> system problem, and *that* is where the optimization belongs. Threads
> exist in large part because operating system vendors didn't bother to
> do a good job of optimizing process context switching and
> creation/destruction.

But why would a database need tons of context switches if it is not supposed
to service loads of requests simultaneously? If there are 50 concurrent
connections, how much context switching overhead is involved, regardless of
the amount of work done in a single connection? Remember that database state
is maintained in shared memory. It does not take a context switch to access it.
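
To illustrate the shared memory point, here is a rough Unix-only sketch in
Python, not PostgreSQL code. The function name bump_in_child is invented for
the example. A forked child updates a counter in an anonymous shared mapping
with ordinary memory writes, and the parent reads the result back with no
message passing between the two:

```python
import mmap
import os
import struct


def bump_in_child(increments=1000):
    # Anonymous mapping; on Unix, mmap defaults to MAP_SHARED, so the
    # region stays shared between parent and child after fork().
    shm = mmap.mmap(-1, 8)
    shm[:8] = struct.pack("q", 0)

    pid = os.fork()
    if pid == 0:
        # Child process, playing the role of a backend: it updates the
        # shared state with plain memory reads and writes.
        for _ in range(increments):
            (value,) = struct.unpack("q", shm[:8])
            shm[:8] = struct.pack("q", value + 1)
        os._exit(0)

    # Parent: wait for the child, then read the shared state directly.
    os.waitpid(pid, 0)
    (value,) = struct.unpack("q", shm[:8])
    shm.close()
    return value
```

Only the child writes here, so no locking is needed; with several writers a
real system would need locks around the shared state.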

That assumption stems from the database being very efficient at creating and
servicing a new connection. I am not very comfortable with that argument.

> Under Linux, from what I've read, process creation/destruction and
> context switching happens almost as fast as thread context switching
> on other operating systems (Windows in particular, if I'm not
> mistaken).

I hear Solaris also has very heavy processes. But PostgreSQL has other issues
with Solaris as well.
>
> > Since DB's switch context a lot under many circumstances, threads
> > should be a win under such circumstances. At the least, it should be
> > helpful in situations where we have multiple CPUs to split query
> > execution between.

Can you give an example where a database does a lot of context switching for
a moderate number of connections?

Shridhar
