Re: Compression in PG

From: Eduardo Morras <emorras(at)s21sec(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Compression in PG
Date: 2009-11-02 09:51:06
Message-ID: 20091102095349.45474558AA6@s21sec.com
Lists: pgsql-performance

At 05:24 02/11/2009, you wrote:

>The only reason I can think of for wanting to compress very small
>datums is if you have a gajillion of them, they're highly
>compressible, and you have extra CPU time coming out of your ears. In
>that case - yeah, you might want to think about pre-compressing them
>outside of Pg. If you're doing this for some other reason you could
>probably get some better advice if you explain what it is...
>
>...Robert

There is another reason. If you losslessly compress all the small text datums with the same algorithm, you get unique, smaller representations of the text that can be used as unique primary keys. Because the compression is reversible, two different texts never produce the same compressed value. You can think of it as a variable-length hashing algorithm.

Depending on the compression method (e.g. static Huffman), you can compare two texts using only their compressed versions, or sort them faster. This cannot be done with the current LZ algorithm.
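
To make the idea concrete, here is a minimal sketch in Python, not anything PostgreSQL does today. It assumes a fixed, made-up frequency table (FREQS) that every datum is encoded with, so only lowercase letters and spaces can be encoded; the function names are hypothetical too. It only demonstrates the equality comparison on compressed values. Sorting by the compressed form would additionally need an order-preserving code, which a plain Huffman code is not in general, and this sketch does not attempt it.

```python
import heapq
from itertools import count

# Hypothetical fixed frequency table shared by all datums (an assumption,
# not measured data). Only these characters can be encoded.
FREQS = {" ": 18, "e": 12, "t": 9, "a": 8, "o": 7, "n": 7, "i": 7, "s": 6,
         "r": 6, "h": 5, "l": 4, "d": 4, "c": 3, "u": 3, "m": 2, "p": 2,
         "g": 2, "b": 1, "y": 1, "f": 1, "w": 1, "k": 1, "v": 1, "x": 1,
         "q": 1, "j": 1, "z": 1}

def build_static_code(freqs):
    """Build a Huffman prefix code (symbol -> bit string) from a fixed table."""
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in sorted(freqs.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

def compress(text, code):
    """Return (bit_length, packed_bytes); the length disambiguates padding."""
    bits = "".join(code[ch] for ch in text)
    padded = bits + "0" * (-len(bits) % 8)
    packed = bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))
    return len(bits), packed

def decompress(blob, code):
    """Invert compress() by walking the bit string one prefix at a time."""
    nbits, data = blob
    bits = "".join(f"{b:08b}" for b in data)[:nbits]
    decode = {b: s for s, b in code.items()}
    out, cur = [], ""
    for bit in bits:
        cur += bit
        if cur in decode:
            out.append(decode[cur])
            cur = ""
    return "".join(out)

CODE = build_static_code(FREQS)

key_a = compress("postgres", CODE)
key_b = compress("postgres", CODE)
key_c = compress("postgrey", CODE)

print(key_a == key_b)  # equal texts give equal compressed keys
print(key_a == key_c)  # different texts give different keys
print(decompress(key_a, CODE))
```

Since the code table never changes, the compressed value is a deterministic, reversible function of the text, which is what lets it stand in as a key. An adaptive scheme such as LZ gives no such guarantee across implementations or settings.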

--------------------------------
Eduardo Morrás González
Dept. I+D+i e-Crime Vigilancia Digital
S21sec Labs
Tlf: +34 902 222 521
Móvil: +34 555 555 555
www.s21sec.com, blog.s21sec.com

