From: reina_ga(at)hotmail(dot)com (Tony Reina)
To: pgsql-admin(at)postgresql(dot)org
Subject: Re: Database storage bloat -- FOLLOWUP
Date: 2004-04-11 20:57:40
Message-ID: 272e4be7.0404111257.555ad4a7@posting.google.com
Lists: pgsql-admin

I thought I'd follow up on the question of storage bloat. I tried
approaching the problem by playing a little loose with database
normalization and using arrays instead of flattening everything out in
the tables.

So my original table,

CREATE TABLE original (
dbIndex integer,
index1 smallint,
index2 smallint,
index3 smallint,
dvalue real
);

becomes

CREATE TABLE newtable (
dbIndex integer,
dvalue real[][][]
);
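
For illustration, here is roughly how I'm populating and reading the
new table (the literal below is a toy 2x2x2 example, not my real data;
my real arrays are much larger):

-- toy insert: a 2 x 2 x 2 array of samples
INSERT INTO newtable (dbIndex, dvalue)
VALUES (1, '{{{0.1,0.2},{0.3,0.4}},{{0.5,0.6},{0.7,0.8}}}');

-- fetch a single sample by its three subscripts (1-based by default)
SELECT dvalue[2][1][2] FROM newtable WHERE dbIndex = 1;

-- or pull a whole slice at once
SELECT dvalue[1:1][1:2][1:2] FROM newtable WHERE dbIndex = 1;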

Now I'd expect to get somewhat better storage here just because I'm
dropping 3 smallint fields (2 bytes x 3 = 6 bytes per value) and
collapsing many small rows, each with its own per-row overhead, into a
single array row.

However, in actuality I'm seeing a 30-fold decrease:

original table = 48 MB
newtable = 1.6 MB !!!!!

If I'm understanding the data directory correctly, the array tuples
are being TOASTed (which, I think, must compress them). The heap for
newtable itself is only 8 kB, and the pg_toast table that goes with it
holds the remaining 1.6 MB.
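
For anyone who wants to check the same thing, this is roughly how I
looked at it. relpages is counted in blocks (8 kB each with the
default block size) and is only refreshed by VACUUM/ANALYZE, so the
numbers are approximate:

-- heap size of the array table, in kB
SELECT relname, relpages * 8 AS approx_kb
FROM pg_class
WHERE relname = 'newtable';

-- size of its associated TOAST table
SELECT t.relname, t.relpages * 8 AS approx_kb
FROM pg_class c
JOIN pg_class t ON t.oid = c.reltoastrelid
WHERE c.relname = 'newtable';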

So I'm very pleased with the storage here. Plus, reads are faster than
with the original design. All in all, I think I've found a good
solution for my kind of data.

Now I know that the use of arrays is considered taboo in the books,
but in my case the data being stored actually do fall into an array
(matrix) structure naturally (it's a time-based recording, so there's
a sample for each time point and each X,Y position -- a natural 3D
matrix).
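
One downside I can already see is updates: with the flat table,
changing one sample touches one small row, but with the array the
whole (toasted, compressed) array value has to be read back, modified,
and rewritten. Something like:

-- flat layout: one small row is updated, and an index on the key
-- columns can help find it
UPDATE original SET dvalue = 0.5
WHERE dbIndex = 1 AND index1 = 5 AND index2 = 2 AND index3 = 3;

-- array layout: the entire array value is detoasted and rewritten
UPDATE newtable SET dvalue[5][2][3] = 0.5
WHERE dbIndex = 1;

That's fine for me, since my data are written once and then only read.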

What are the possible downsides that I may be missing?
