Re: Ad-hoc table type?

From: Decibel! <decibel(at)decibel(dot)org>
To: pgsql(at)mohawksoft(dot)com
Cc: "Oleg Bartunov" <oleg(at)sai(dot)msu(dot)su>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Ad-hoc table type?
Date: 2008-10-06 16:17:35
Message-ID: FC55C97D-A80C-4648-8F73-E68D13EE2D21@decibel.org
Lists: pgsql-hackers

On Sep 29, 2008, at 6:16 AM, pgsql(at)mohawksoft(dot)com wrote:
> The hstore module, as I said, looks really cool, I've contemplated
> something like it. I have a module provides a set of accessors for an
> XML text column that works similarly, but it parses the XML on each
> access and the application has to create the XML. (I have XML creation
> modules for Java, PHP, C++, and standard C bindings.)

Yeah, "ad-hoc" storage is always a huge problem in databases. For
years the only way to do it was with EAV, which is tricky at best.
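
To make the EAV pain concrete, here is a minimal sketch using Python's
built-in sqlite3 module. The table name, column names and sample data
are made up for illustration; nothing here is specific to PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical EAV table: one row per (entity, attribute) pair.
conn.execute("CREATE TABLE eav (entity INTEGER, attr TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [
        (1, "color", "red"),
        (1, "weight", "12"),
        (2, "color", "blue"),
    ],
)

# Rebuilding one logical row needs one aggregate branch per attribute,
# so the query has to change every time a new attribute name shows up.
# Every value also comes back as untyped text.
rows = conn.execute(
    """
    SELECT entity,
           MAX(CASE WHEN attr = 'color'  THEN value END) AS color,
           MAX(CASE WHEN attr = 'weight' THEN value END) AS weight
    FROM eav
    GROUP BY entity
    ORDER BY entity
    """
).fetchall()
print(rows)
```

Entity 2 has no weight row, so its weight comes back as NULL, and the
weight for entity 1 is the string '12' rather than a number.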

In my experience, there typically isn't an un-bounded set of possible
attribute names. It's usually fairly constrained, but the problem is
that you never know when a new one will just pop up.

It's very common right now for people to use either XML or YAML to
deal with this. That has its own set of problems.

There are a few major improvements to be had here:

1: We should have a flexible storage mechanism that can either be
used with its own native syntax, or can interface to other hash
formats such as XML or YAML. Of course, both XML and YAML allow an
obscene amount of nesting, etc., but generally people are only using
these in a very simple form to emulate a hash table. It would be
interesting to allow casting hstore to and from other proprietary
hash formats as well, such as Perl hashes.

2: Storage of attribute names can quickly become *very* expensive.
Even with short 6-10 character names, you can easily end up using
half the storage for just attribute names. I'd like to see hstore
support storing attribute names in a lookup table, or using some
other means to reduce the storage overhead.

3: Related to #2, storing numbers stinks because you end up burning 1
byte per digit. Some concept of data type for an attribute would
improve this.
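
A rough sketch of how the three points fit together, in Python. This is
not how hstore works; the byte layout, the type tags and every function
name below are assumptions made up for illustration. It flattens a
simple XML document into a hash, replaces attribute names with ids from
a lookup table, and packs integer values as 4 bytes instead of one byte
per digit.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical flat XML document used as a simple hash table.
doc = (
    "<attrs>"
    "<created_at>1223309855</created_at>"
    "<account_id>48213377</account_id>"
    "<status>active</status>"
    "</attrs>"
)

def xml_to_dict(text):
    """Flatten one level of XML children into a dict (nesting is ignored)."""
    return {child.tag: (child.text or "") for child in ET.fromstring(text)}

def naive_size(attrs):
    """Bytes used if every name and value is stored as plain text."""
    return sum(len(k.encode()) + len(v.encode()) for k, v in attrs.items())

def compact_encode(attrs, name_ids):
    """Encode each pair as a 2-byte name id, a 1-byte type tag and a value.

    name_ids is the shared lookup table; it grows whenever a new
    attribute name pops up. Values that parse as 32-bit integers are
    packed into 4 bytes, everything else is kept as length-prefixed text.
    Caveat: a value such as '007' would lose its leading zeros.
    """
    out = bytearray()
    for name, value in attrs.items():
        name_id = name_ids.setdefault(name, len(name_ids))
        try:
            packed = b"i" + struct.pack("<i", int(value))
        except (ValueError, struct.error):
            raw = value.encode()
            packed = b"t" + struct.pack("<H", len(raw)) + raw
        out += struct.pack("<H", name_id) + packed
    return bytes(out)

attrs = xml_to_dict(doc)
name_ids = {}
encoded = compact_encode(attrs, name_ids)
print(naive_size(attrs), len(encoded))
```

For this sample the text form needs 50 bytes and the compact form 25.
The lookup table has its own cost, but it is paid once rather than on
every row, which is where the saving would come from.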

Sadly, I don't have time to work on any of this. But these things are
issues for my company, and we do have money. ;)
--
Decibel!, aka Jim C. Nasby, Database Architect decibel(at)decibel(dot)org
Give your computer some brain candy! www.distributed.net Team #1828
