Re: JDBC Performance

From: Tim Kientzle <kientzle(at)acm(dot)org>
To: PostgreSQL general mailing list <pgsql-general(at)postgresql(dot)org>, "Keith L(dot) Musser" <kmusser(at)idisys(dot)com>
Subject: Re: JDBC Performance
Date: 2000-09-28 03:18:20
Message-ID: 39D2B87C.32C3DB4B@acm.org
Lists: pgsql-general

> I'm finding that ... my CPU spends about 60% of the time in JDBC, and about
> 40% of the time in the postmaster backend.
> (Each query takes 2 msec on my machine, RH Linux 6.2, 733 MHz Intel, w/
> lots of memory.)

This doesn't sound too bad to me, to be honest. I've not tried using
JDBC with PostgreSQL, but I've done a lot with MySQL (and some with
Oracle, although not as recently). I'm used to seeing 5-10ms for
a fairly basic indexed query on a PII/266.

A large portion of the client-side overhead you're seeing involves
the conversion of strings into bytes for transfer over the network
(and the reverse conversion on the other side). Java strings use
Unicode, and this has to be translated into bytes for the network.
This surprises people familiar with C, but it is the "right" way
to do it; characters and bytes are not the same thing.
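
As a rough illustration of that conversion step (not the driver's actual
code), the sketch below encodes a query string into bytes and back. The
class name, the sample queries, and the choice of UTF-8 are assumptions made
for the example; a real driver uses whatever encoding the connection is
configured for.

```java
import java.nio.charset.StandardCharsets;

public class QueryEncoding {
    // Convert a Java (UTF-16) string into the bytes that would go on the wire.
    // UTF-8 is an assumption for this sketch.
    static byte[] encode(String sql) {
        return sql.getBytes(StandardCharsets.UTF_8);
    }

    // Reverse conversion, as done for data coming back from the server.
    static String decode(byte[] wire) {
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String ascii = "SELECT name FROM users WHERE id = 42";
        String accented = "SELECT 'caf\u00e9'";
        // Character count and byte count agree only for pure-ASCII text.
        System.out.println(ascii.length() + " chars, " + encode(ascii).length + " bytes");
        System.out.println(accented.length() + " chars, " + encode(accented).length + " bytes");
    }
}
```

Every query sent and every string column read goes through a step like
this, which is why the cost shows up on the client side.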

Some of this overhead can be reduced with a really good JIT, but
not all. Experiment with different JVMs and see if that helps any.

Several standard suggestions for improving JDBC performance:

* Cache. Keep data within the client whenever you can to reduce
the number of round-trips to the database.
* Minimize the number of queries. It often pays off big to
do a single SELECT that returns many rows rather than to do
a bunch of smaller SELECTs. Each query involves query construction
at the client, network overhead and parsing and execution overhead;
after all that, each row is relatively cheap.
* Use multi-threading, but cautiously. Because of the intrinsic delays
of communicating with a separate server, you can improve performance
by opening a couple of database connections and issuing queries over
each one. This only helps up to a point, though, and good
multi-threaded code is hard to write in any language, including Java.
This helps less with a local server than a networked one, of course.
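
A minimal sketch of the "one SELECT instead of many" suggestion, using only
the standard java.sql interfaces. The table and column names (users, id,
name) and the class and method names are hypothetical; the point is that N
lookups collapse into one round-trip.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchedLookup {
    // Build a single query with one placeholder per key, e.g. for n = 3:
    // SELECT id, name FROM users WHERE id IN (?, ?, ?)
    static String buildInQuery(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("need at least one key");
        }
        String placeholders = String.join(", ", Collections.nCopies(n, "?"));
        return "SELECT id, name FROM users WHERE id IN (" + placeholders + ")";
    }

    // Fetch all requested rows in one round-trip instead of one query per id.
    // Requires a live connection, so main() below does not call it.
    static Map<Integer, String> fetchNames(Connection conn, List<Integer> ids)
            throws SQLException {
        Map<Integer, String> names = new HashMap<>();
        if (ids.isEmpty()) {
            return names;
        }
        try (PreparedStatement ps = conn.prepareStatement(buildInQuery(ids.size()))) {
            for (int i = 0; i < ids.size(); i++) {
                ps.setInt(i + 1, ids.get(i)); // JDBC parameters are 1-based
            }
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    names.put(rs.getInt("id"), rs.getString("name"));
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(buildInQuery(3));
    }
}
```

The returned map can also serve as the client-side cache from the first
suggestion: check it before going back to the database.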

C can be significantly faster, simply because you can build the query
directly as an ASCII string and then just pump it over the socket
without the character-to-byte conversion overhead. Of course, that
only applies if you're using pretty simple queries. For more complex
queries or large databases, the database processing time dominates,
and nothing else really matters.

There are a lot of other factors to consider, of course. In particular,
time per query is usually less important than queries per second.
During the wait time for one transaction, other transactions can be
going on simultaneously. If you're writing servlet-based systems,
for example, you can get pretty good parallelism, especially on SMP
machines, where the DB and Java can actually run on separate processors.
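
To make the latency-versus-throughput point concrete, here is a small
sketch that overlaps several simulated queries on a thread pool. There is no
database involved: fakeQuery is a hypothetical stand-in that just sleeps for
the given latency, and the class name, pool sizes and query counts are
arbitrary choices for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class OverlappedQueries {
    // Stand-in for a real query: waits for latencyMillis, then returns its id.
    static int fakeQuery(int id, long latencyMillis) throws InterruptedException {
        Thread.sleep(latencyMillis);
        return id;
    }

    // Run `count` simulated queries over `workers` threads (think: one
    // connection per thread) and return how many completed.
    static int runAll(int count, int workers, long latencyMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int i = 0; i < count; i++) {
                final int id = i;
                pending.add(pool.submit(() -> fakeQuery(id, latencyMillis)));
            }
            int done = 0;
            for (Future<Integer> f : pending) {
                f.get(); // blocks until that query has finished
                done++;
            }
            return done;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (int workers : new int[] {1, 4}) {
            long start = System.nanoTime();
            int done = runAll(200, workers, 2);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.println(workers + " worker(s): " + done
                    + " queries in " + elapsedMs + " ms");
        }
    }
}
```

Each simulated query still takes the same time on its own; only the number
completed per second changes as more of them are in flight at once. With a
real server the gain flattens out once the database or the network becomes
the bottleneck.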

- Tim
