Skip site navigation (1) Skip section navigation (2)

Bytea network traffic: binary vs text result format

From: "Miha D(dot) Puc" <miha(dot)puc(at)eba(dot)si>
To: pgsql-interfaces(at)postgresql(dot)org
Subject: Bytea network traffic: binary vs text result format
Date: 2007-06-03 10:36:23
Message-ID: 466299A7.20403@eba.si (view raw or flat)
Thread:
Lists: pgsql-interfaces
Hi!

There was some debate recently about using text or binary format.
There's people who would like to use it but have trouble converting
binary encoded results into native types and there's people that say
there's not much performance difference.

I'd like to stress that performance is very different over slow
network.  The biggest difference is for byte where the text format
performance is about 3.5 times worse at inserts and updates and about
2.9 times worse at selects . Here's the reasoning:

In text format bytea are escaped using PQescapeBytea. In an average
binary stream about 2/3 would be escaped. Each escaped byte becomes of
form \\ooo at upload and of form \ooo for download, so the size of the
escaped stream is 1/3 + 2/3 * 5 = 11/3 = 3.6 and 1/3 + 2/3 * 4 = 3
respectively.

Here are the results of my test. I inserted and selected an OpenOffice
document of size 2Mb over a 2M/512K cable.
text format:
  insert: 120.1s
  select: 24.9s
binary format:
  insert: 33.5s
  select: 8.6s
factor: 
  insert: 3.6
  select: 2.9

The difference between the test and the above calculation comes from the
estimate that 2/3 of bytes are escaped where in fact 95 out of 256 are
escaped (63%).

So there is a need (people asking) and reason (performance) to use
binary format. But there's a huge drawback - the conversions. It's easy
for varchar, not too bad for basic types (int, float, bool), effort is
needed for timestamp, date, time and numeric is a pain.

So with all the above there should be a utility for conversion between
binary format and native types and/or string format in libpq.

Regards,
Miha Puc



Responses

pgsql-interfaces by date

Next:From: Markus SchiltknechtDate: 2007-06-04 12:54:39
Subject: Re: Bytea network traffic: binary vs text result format
Previous:From: Francisco Figueiredo Jr.Date: 2007-05-30 15:23:42
Subject: Re: Question on NpgsqlParameter object

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group