Re: cast bytea to double precision[]

From: Mathieu Dubois <mathieu(dot)dubois(at)limsi(dot)fr>
To: pgsql-novice(at)postgresql(dot)org
Subject: Re: cast bytea to double precision[]
Date: 2011-07-26 17:25:36
Message-ID: 4E2EF890.9090901@limsi.fr
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

On 07/26/2011 04:30 PM, Merlin Moncure wrote:
> On Tue, Jul 26, 2011 at 2:45 AM, Mathieu Dubois<mathieu(dot)dubois(at)limsi(dot)fr> wrote:
>> Hello,
>>
>> Le 25/07/2011 17:58, Mathieu Dubois a écrit :
>>> On 07/25/2011 05:54 PM, Merlin Moncure wrote:
>>>> On Sun, Jul 24, 2011 at 2:03 PM, Mathieu
>>>> Dubois<mathieu(dot)dubois(at)limsi(dot)fr> wrote:
>>>>> I have found a solution by myself for the conversion:
>>>>> regexp_split_to_array(sig_vector, E',')::double precision[] (elements
>>>>> are in
>>>>> fact separated by commas).
>>>>>
>>>>> To convert my column I have used:
>>>>> ALTER TABLE my_table ALTER sig_vector TO double precision[] USING
>>>>> regexp_split_to_array(sig_vector, E',')::double precision[];
>>>>>
>>>>> Is that correct?
>>>>> Is it correct to pass the column name to regexp_split_to_array()?
>>>> Yeah -- you are just passing a column's data into a function as an
>>>> argument -- standard practice. This will work -- your bytea is really
>>>> a text column, so it's just a matter of breaking up the string.
>>>> regexp_* functions are great for that.
>>> Thank you very much for your reply.
>>>
>>> I will launch the conversion right now.
>>>
>> The main reason to do this was to have smaller backups.
>> The size of a compressed backup was around 1GB with bytea.
>> I have converted the columns (on a copy of the database) but the expected
>> gain are not here!
>> With double precision[] it is still around 1GB (a little bit smaller but
>> just a few MB).
>>
>> Also the size on the disk is not smaller.
>> I have listed the content of /var/lib/postgres/8.4/main/base with du and the
>> 2 versions have the same size (3.1GB).
>>
>> Does it make sense?
>> My hypothesis is that the compression algorithm is able to find regularities
>> the data so it finds the same regularity in bytea and in double precision[].
>>
>> Is there any advantage to use double precision[] over bytea in my case?
> probably not -- arrays can be significantly smaller than a set of
> individual tuples each holding one value because of the tuple
> overhead, but you still have to pay for the array header and a 4 byte
> length/null indicator per element.
>
> A packed string is often the smallest way to store data, although not
> necessarily the best. A double precision[] comes with a lot of syntax
> advantages.
Thanks for your advice!

I find the result surprising because the floats are encoded with a lot
of characters (something like 20) while a double is 8 bytes.

I have tried to run VACCUM but it changed nothing...

All of my code is based on strings so I won't take time to modify it if
there is no gain.

Mathieu

> merlin
>

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Merlin Moncure 2011-07-26 18:40:43 Re: cast bytea to double precision[]
Previous Message Merlin Moncure 2011-07-26 14:30:43 Re: cast bytea to double precision[]