Re: cast bytea to double precision[]

From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Mathieu Dubois <mathieu(dot)dubois(at)limsi(dot)fr>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: cast bytea to double precision[]
Date: 2011-07-26 14:30:43
Message-ID: CAHyXU0zP9tcLODN2iAw5DiYWPY50uvJz30V1duT_4QHqcMSNow@mail.gmail.com
Lists: pgsql-novice

On Tue, Jul 26, 2011 at 2:45 AM, Mathieu Dubois <mathieu(dot)dubois(at)limsi(dot)fr> wrote:
> Hello,
>
> On 25/07/2011 17:58, Mathieu Dubois wrote:
>>
>> On 07/25/2011 05:54 PM, Merlin Moncure wrote:
>>>
>>> On Sun, Jul 24, 2011 at 2:03 PM, Mathieu
>>> Dubois<mathieu(dot)dubois(at)limsi(dot)fr> wrote:
>>>>
>>>> I have found a solution by myself for the conversion:
>>>> regexp_split_to_array(sig_vector, E',')::double precision[] (elements
>>>> are in
>>>> fact separated by commas).
>>>>
>>>> To convert my column I have used:
>>>> ALTER TABLE my_table ALTER sig_vector TO double precision[] USING
>>>> regexp_split_to_array(sig_vector, E',')::double precision[];
>>>>
>>>> Is that correct?
>>>> Is it correct to pass the column name to regexp_split_to_array()?
>>>
>>> Yeah -- you are just passing a column's data into a function as an
>>> argument -- standard practice. This will work -- your bytea is really
>>> a text column, so it's just a matter of breaking up the string.
>>> regexp_* functions are great for that.
>>
>> Thank you very much for your reply.
>>
>> I will launch the conversion right now.
>>
> The main reason to do this was to get smaller backups.
> The size of a compressed backup was around 1GB with bytea.
> I have converted the columns (on a copy of the database), but the expected
> gains are not there!
> With double precision[] it is still around 1GB (a little smaller, but
> only by a few MB).
>
> Also the size on the disk is not smaller.
> I have listed the content of /var/lib/postgres/8.4/main/base with du, and the
> two versions have the same size (3.1GB).
>
> Does it make sense?
> My hypothesis is that the compression algorithm is able to find regularities
> in the data, so it finds the same regularities in bytea as in double precision[].
>
> Is there any advantage to use double precision[] over bytea in my case?

probably not -- arrays can be significantly smaller than a set of
individual tuples each holding one value because of the per-tuple
overhead, but you still have to pay for the array header (plus a null
bitmap if any element is null; a fixed-length element type like double
precision is then stored as 8 raw bytes per element).
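To see these overheads concretely, `pg_column_size()` reports the storage size of any value. A rough sketch (the literals are illustrative, not data from the thread):

```sql
-- Array value: varlena + array header, then 8 raw bytes per float8 element.
SELECT pg_column_size('{1.5,2.5,3.5}'::double precision[]) AS array_bytes;

-- The same numbers as a packed text string: a short header plus one byte
-- per character, which is why the packed form can come out smaller.
SELECT pg_column_size('1.5,2.5,3.5'::text) AS text_bytes;
```

Comparing the two results on a representative row shows how much of the size is header overhead versus payload.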

A packed string is often the smallest way to store data, although not
necessarily the best. A double precision[] comes with a lot of syntax
advantages.
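For the archives: the conversion discussed upthread uses `ALTER COLUMN ... TYPE` (the keyword is `TYPE`, not `TO`). A sketch with the thread's table and column names, assuming the column's contents are already readable as comma-separated text:

```sql
-- Rewrite each stored value via the USING expression while changing the type.
ALTER TABLE my_table
    ALTER COLUMN sig_vector
    TYPE double precision[]
    USING regexp_split_to_array(sig_vector, E',')::double precision[];
```

If sig_vector is genuinely bytea rather than text, it would first need to be decoded, e.g. with `convert_from(sig_vector, 'UTF8')`, before being passed to regexp_split_to_array().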

merlin
