Re: Unicode database + JDBC driver performance

From: Barry Lind <blind(at)xythos(dot)com>
To: Jan Ploski <jpljpl(at)gmx(dot)de>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Unicode database + JDBC driver performance
Date: 2002-12-23 17:52:28
Message-ID: 3E074D5C.2050608@xythos.com
Lists: pgsql-general

Jan,

You say you are using 7.2.1. Is that for both the server and the JDBC
driver? There is a performance patch in the 7.3 driver that replaces the
built-in Java routines for converting to/from UTF-8 with a custom
implementation. The built-in Java routines are very slow on some JDKs
(although on JDK 1.4 they are pretty good). Can you try the 7.3 driver?
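
To illustrate the idea of a hand-written conversion, here is a minimal sketch. It is not the actual driver code: the class name Utf8Sketch and the method decodeUtf8 are made up for this example, it handles only 1-, 2- and 3-byte sequences, and it does no validation of malformed input.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Sketch {
    // Hypothetical helper (not the driver's code): decodes UTF-8 bytes by hand.
    // Assumes well-formed input containing only 1-, 2- and 3-byte sequences.
    public static String decodeUtf8(byte[] data, int offset, int length) {
        // A UTF-8 sequence never produces more chars than it has bytes.
        char[] out = new char[length];
        int o = 0;
        int i = offset;
        int end = offset + length;
        while (i < end) {
            int b = data[i++] & 0xFF;
            if (b < 0x80) {
                // Single byte: plain ASCII.
                out[o++] = (char) b;
            } else if (b < 0xE0) {
                // Two bytes: 110xxxxx 10xxxxxx.
                out[o++] = (char) (((b & 0x1F) << 6) | (data[i++] & 0x3F));
            } else {
                // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx.
                int b2 = data[i++] & 0x3F;
                int b3 = data[i++] & 0x3F;
                out[o++] = (char) (((b & 0x0F) << 12) | (b2 << 6) | b3);
            }
        }
        return new String(out, 0, o);
    }

    public static void main(String[] args) {
        // Sample text with 2-byte (u-umlaut, sharp s) and 3-byte (euro sign) characters.
        byte[] bytes = "Gr\u00fc\u00dfe, PostgreSQL \u20ac".getBytes(StandardCharsets.UTF_8);
        String viaJdk = new String(bytes, StandardCharsets.UTF_8);
        String viaSketch = decodeUtf8(bytes, 0, bytes.length);
        System.out.println(viaJdk.equals(viaSketch));
    }
}
```

A loop like this avoids the general-purpose converter machinery, which is where the time goes on the slow JDKs. Whether it actually helps on your JDK is something only a measurement will tell.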

thanks,
--Barry

Jan Ploski wrote:
> Hello,
>
> I have some questions regarding PostgreSQL handling of Unicode databases
> and their performance. I am using version 7.2.1 and running two benchmarks
> against a database set up with LATIN1 encoding and the same database
> with UNICODE. The database consists of a single "test" table:
>
>  Column |  Type   | Modifiers
> --------+---------+-----------
>  id     | integer | not null
>  txt    | text    | not null
> Primary key: test_pkey
>
> The client is written in Java, relies on the official JDBC driver,
> and runs on the same machine as the database.
>
> Benchmark 1:
>
> Insert 10,000 rows (in 10 transactions, 1000 rows per transaction)
> into table "test". Each row contains 674 characters, most of which
> are ASCII.
>
> Benchmark 2:
>
> select * from test, repeated 10 times in a loop
>
>
> I am measuring the disk space taken by the database in each case
> (LATIN1 vs UNICODE) and the time it takes to run the benchmarks.
> I don't understand the results:
>
> Disk space change (after inserts and vacuumdb -f):
> LATIN1   UNICODE
> 764K     640K
>
> I would have expected the Unicode database to take more space,
> perhaps even twice as much. Apparently not (and that's nice).
>
> Avg. Benchmark execution times (obtained with the 'time' command, repeatedly):
> Benchmark 1:
> LATIN1   UNICODE
> 11.5s    14.5s
>
> Benchmark 2:
> LATIN1   UNICODE
> 4.7s     8.6s
>
> The Unicode database is slower both on INSERTs and especially on
> SELECTs. I am wondering why. Since Java uses Unicode internally,
> shouldn't it actually be more efficient to store/retrieve character
> data in that format, with no recoding? Maybe it is an issue with the
> JDBC driver? Or is handling Unicode inherently much slower on the
> backend side?
>
> Take care -
> JPL
>
>
>
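
On the disk space and "no recoding" questions: a UNICODE database stores text as UTF-8, where ASCII characters take one byte each, while Java strings are UTF-16 internally, so the driver has to convert in both directions either way. The sketch below compares the encoded size of a mostly-ASCII row in both encodings. The class name EncodedSize and the sample row are made up for illustration. It says nothing about on-disk overhead in the backend.

```java
import java.nio.charset.StandardCharsets;

public class EncodedSize {
    // Bytes needed to encode the text as UTF-8 (what a UNICODE database stores).
    public static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes needed to encode the text as ISO-8859-1 (LATIN1).
    public static int latin1Bytes(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        // Hypothetical sample row: 670 ASCII characters plus 4 accented ones,
        // 674 characters in total, similar in shape to the benchmark rows.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 670; i++) {
            sb.append('a');
        }
        sb.append("\u00e4\u00f6\u00fc\u00df");
        String row = sb.toString();

        System.out.println("LATIN1: " + latin1Bytes(row) + " bytes"); // 674
        System.out.println("UTF-8:  " + utf8Bytes(row) + " bytes");   // 678
    }
}
```

For text that is mostly ASCII, the two encodings come out almost the same size, so a doubling in disk usage should not be expected.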
