Re: Client/Server compression?

From: Greg Copeland <greg(at)CopelandConsulting(dot)Net>
To: Arguile <arguile(at)lucentstudios(dot)com>
Cc: PostgreSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Client/Server compression?
Date: 2002-03-15 18:47:20
Message-ID: 1016218040.24597.15.camel@mouse.copelandconsulting.net
Lists: pgsql-hackers

On Thu, 2002-03-14 at 14:03, Arguile wrote:

[snip]

> I'm sceptical of the benefit such compression would provide in this setting
> though. We're dealing with sets that would have to be compressed every time
> (no caching), which might be a bit expensive on a database server. Having it
> as a default-off option for psql might be nice, but I wonder if it's worth
> the time, effort, and cpu cycles.
>

I dunno. That's a good question. For now, I'm making what tends to be
a safe assumption (oops... that word) that most database servers will be
I/O bound rather than CPU bound. *IF* that assumption holds true, it
sounds like it may make even more sense to implement this. I do know
that in the past I've seen 90+% compression ratios on many databases,
and 50%-90% compression ratios on result sets using tunneled
compression schemes (which were also compressing things other than the
datasets, which probably hurt the overall ratios). Depending on the
workload and the resources available on the database system, latency
could actually be reduced, depending on where you measure it. That is,
do you measure latency as the first packet back to the remote or the
last packet back to the remote? If you use the last packet, compression
may actually win.
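
For illustration only, here's a rough zlib sketch (not PostgreSQL code;
the buffer contents and the level are placeholders) of the kind of ratio
measurement I mean, compressing a captured result-set buffer in one shot:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zlib.h>

int main(void)
{
    /* Stand-in for a chunk of result-set data; real rows would vary more. */
    static char resultset[65536];
    memset(resultset, 'x', sizeof(resultset));

    uLongf clen = compressBound(sizeof(resultset));
    Bytef *cbuf = malloc(clen);
    if (cbuf == NULL)
        return 1;

    /* Level 6 here is just zlib's usual middle-ground default. */
    if (compress2(cbuf, &clen, (const Bytef *) resultset,
                  sizeof(resultset), 6) == Z_OK)
        printf("compressed %zu -> %lu bytes (%.1f%% saved)\n",
               sizeof(resultset), (unsigned long) clen,
               100.0 * (1.0 - (double) clen / sizeof(resultset)));

    free(cbuf);
    return 0;
}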

My current thoughts are to allow for enabling/disabling compression and
a variable compression level (1-9) in the database configuration.
Worst case, it may be fun to implement, and I'm thinking there may
actually be some surprises in the end result if it's done properly.
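
As a rough sketch of what I have in mind (the names here are invented,
and this assumes zlib), the level from the configuration would simply be
fed to a per-connection deflate stream that gets flushed after each
protocol message so the peer never waits on buffered data:

#include <string.h>
#include <zlib.h>

static z_stream zs;

/* level: 1 (fastest) .. 9 (best compression), taken from the config. */
int
compress_init(int level)
{
    memset(&zs, 0, sizeof(zs));
    return (deflateInit(&zs, level) == Z_OK) ? 0 : -1;
}

/* Compress one outgoing protocol message into 'out'; returns the
 * number of compressed bytes written. */
size_t
compress_message(const char *msg, size_t len, char *out, size_t outlen)
{
    zs.next_in = (Bytef *) msg;
    zs.avail_in = (uInt) len;
    zs.next_out = (Bytef *) out;
    zs.avail_out = (uInt) outlen;
    deflate(&zs, Z_SYNC_FLUSH);    /* flush so this message decodes now */
    return outlen - zs.avail_out;
}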

In looking at the communication code, it looks like only an 8k buffer is
used. I'm currently looking to bump this up to 32k, as most OSes tend to
have a throughput sweet spot with buffer sizes between 32k and 64k.
Others, depending on the devices in use, like even bigger buffers.
Because this may be a minor optimization, especially on a heavily
loaded server, we may want to consider making this a configurable
parameter.
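
Just to illustrate (this is not the actual pqcomm code; the names and
the setsockopt hint are mine), making the buffer size configurable would
look something like replacing the fixed array with one sized at startup:

#include <stdlib.h>
#include <sys/socket.h>

static char  *send_buffer;          /* was effectively: static char buf[8192]; */
static size_t send_buffer_size;

int
init_send_buffer(int sock, size_t configured_size)  /* e.g. 32768 from config */
{
    send_buffer_size = configured_size;
    send_buffer = malloc(send_buffer_size);
    if (send_buffer == NULL)
        return -1;

    /* Optionally hint the kernel toward a similarly sized socket buffer. */
    int sz = (int) send_buffer_size;
    setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &sz, sizeof(sz));

    return 0;
}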

Greg
