Re: revised hstore patch

From: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
To: tgl(at)sss(dot)pgh(dot)pa(dot)us (Tom Lane), pgsql-hackers(at)postgresql(dot)org
Subject: Re: revised hstore patch
Date: 2009-07-22 00:06:42
Message-ID: 87tz15mvhp.fsf@news-spur.riddles.org.uk
Lists: pgsql-hackers

>>>>> "Tom" == Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

Tom> Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk> writes:
>> Revision to previous hstore patch to fix (and add tests for) some edge
>> case bugs with nulls or empty arrays.

Tom> I took a quick look at this, and have a couple of beefs
Tom> associated with upgrade risks.

Tom> 1. The patch arbitrarily changes the C-code names of several
Tom> existing SQL functions.

(a) As written, it provides all of the old names too.

(b) Many of the old names are significant collision risks.

(This was all discussed previously. I specifically said that
compatibility was being maintained on this point; you obviously missed
that.)

Tom> 2. The patch changes the on-disk representation of hstore. That
Tom> is clearly necessary to achieve the goal of allowing keys/values
Tom> longer than 64K, but it breaks on-disk compatibility from 8.4 to
Tom> 8.5. I'm not sure what our threshold is for allowing
Tom> compatibility breaks, but I think it's higher than this. The
Tom> demand for longer values inside an hstore has not been very
Tom> great.

The intention is that hstore(record) should work for all practically
useful record sizes. While it's possible for records to be much
larger than 1GB, in practice you're going to run into issues long
before then. Conversely, text fields over 64k are much more common.

The code already has users who are using it for audit-trail stuff
(easily computing the changes between old and new records and storing
only the differences). Perhaps one of the existing users could express
an opinion on this point.
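
The audit-trail use described above can be sketched roughly as follows.
This is an illustration only: Python dicts stand in for hstore(record)
values, and the name `record_diff` is hypothetical, not part of hstore.

```python
def record_diff(old, new):
    """Return the entries of `new` that differ from `old`.

    Both arguments are dicts mapping column name to text value (None for
    NULL), standing in for hstore(record). Hypothetical helper, not
    hstore's API.
    """
    missing = object()  # sentinel so an absent key differs from a None value
    return {k: v for k, v in new.items() if old.get(k, missing) != v}


old_row = {"id": "1", "name": "a", "note": None}
new_row = {"id": "1", "name": "b", "note": None}
changes = record_diff(old_row, new_row)  # only the changed column is kept
```

Storing only `changes` per update is what makes a long text column a
problem under a 64K value limit: a single large changed field cannot be
recorded at all.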

Certainly when developing this I had _SIGNIFICANT_ encouragement, some
of it from YOU, for increasing the limit. (See, for example,
http://archives.postgresql.org/pgsql-hackers/2009-03/msg00577.php or
http://archives.postgresql.org/pgsql-hackers/2009-03/msg00607.php in
which alternative limits are discussed; I only noticed later that it
was possible to increase the limit to 1GB for both keys and values
without using extra space.)
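
The "without using extra space" point comes down to bit budgeting. As a
rough sketch, assuming (for illustration only; the patch's actual structs
may differ) that each key and each value gets one 32-bit entry word, and
that 2 bits of each word are reserved for flags, the arithmetic is:

```python
# Assumed layout for illustration only; not taken from the patch itself.
OLD_LEN_BITS = 16    # a 16-bit length field gives the old "64K" limit
NEW_WORD_BITS = 32   # hypothetical: one 32-bit word per key and per value
NEW_FLAG_BITS = 2    # hypothetical: flag bits reserved in each word

old_max = 2 ** OLD_LEN_BITS - 1                      # largest old length
new_max = 2 ** (NEW_WORD_BITS - NEW_FLAG_BITS) - 1   # just under 1GB
per_pair_overhead_bytes = 2 * NEW_WORD_BITS // 8     # bytes per key/value pair
```

Under these assumptions the per-pair overhead is 8 bytes, the same as a
layout with two 16-bit lengths plus one 32-bit position word.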

Tom> Perhaps an appropriate thing to do is separate out the
Tom> representation change from the other new features, and apply
Tom> just the latter for now. Or maybe we should think about having
Tom> two versions of hstore.

Both of those options suck (and I don't believe either would suit users
of the code).

I'm prepared to give slightly more consideration to option #3: make
the new code read the old format as well as the new one. I believe
(though I have not yet tested) that it is possible to reliably
distinguish the two with relatively low overhead, though the overhead
would be nonzero, and do an in-core format conversion (which would
result in writing out the new format if anything changed).
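
A minimal sketch of what option #3 could look like, in Python rather than
the C it would really be written in. Everything here is hypothetical: the
two byte layouts, the marker bit `NEW_FORMAT_FLAG`, and the function names
are made up for illustration, and the real formats may not be
distinguishable by anything as simple as one flag bit.

```python
import struct

NEW_FORMAT_FLAG = 0x80000000  # hypothetical marker bit in the first word


def encode_old(pairs):
    """Hypothetical old layout: count word, then 16-bit lengths per pair."""
    out = [struct.pack("<I", len(pairs))]
    for k, v in pairs.items():
        out.append(struct.pack("<HH", len(k), len(v)) + k + v)
    return b"".join(out)


def encode_new(pairs):
    """Hypothetical new layout: flagged count word, 32-bit lengths per pair."""
    out = [struct.pack("<I", len(pairs) | NEW_FORMAT_FLAG)]
    for k, v in pairs.items():
        out.append(struct.pack("<II", len(k), len(v)) + k + v)
    return b"".join(out)


def decode_any(buf):
    """Read either layout; return (pairs, was_old_format)."""
    (head,) = struct.unpack_from("<I", buf, 0)
    is_new = bool(head & NEW_FORMAT_FLAG)
    fmt = "<II" if is_new else "<HH"
    count = head & ~NEW_FORMAT_FLAG
    pos, pairs = 4, {}
    for _ in range(count):
        klen, vlen = struct.unpack_from(fmt, buf, pos)
        pos += struct.calcsize(fmt)
        pairs[buf[pos:pos + klen]] = buf[pos + klen:pos + klen + vlen]
        pos += klen + vlen
    return pairs, not is_new


def update(buf, key, value):
    """Modify a stored value; the result is always written in the new layout."""
    pairs, _ = decode_any(buf)
    pairs[key] = value
    return encode_new(pairs)
```

The nonzero overhead mentioned above corresponds to the format check at
the top of `decode_any`, which every read pays; old-format data is only
rewritten when `update` is called on it.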

--
Andrew (irc:RhodiumToad)
