Re: [HACKERS] LZTEXT for rule plan stings

From: wieck(at)debis(dot)com (Jan Wieck)
To: Don Baccus <dhogaza(at)pacifier(dot)com>
Cc: Hannu Krosing <hannu(at)tm(dot)ee>, Jan Wieck <wieck(at)debis(dot)com>, PostgreSQL HACKERS <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [HACKERS] LZTEXT for rule plan stings
Date: 2000-02-26 03:06:05
Message-ID: m12OXYT-0003kgC@orion.SAPserv.Hamburg.dsh.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Don Baccus wrote:

> At 03:02 AM 2/26/00 +0200, Hannu Krosing wrote:
> >Don Baccus wrote:
> >>
> >>
> >> I still like the idea of "text" being implemented under the hood
> >> as lzText for a quick 7.1 release if that idea works out ...
> >
> >But without TOAST it would result in _undefined_ max tuple length,
> >which is probably not desirable.

True.

> Boy, I'd sure find it desirable. There's nothing to stop people from
> using varchar(8000) or whatever if they want a predictable top limit.
> Text is not a standard type, and this wouldn't break standard semantics.
>
> lzText wasn't removed because folks thought it was useless, IIRC,
> it was removed because TOAST was an exciting and much more powerful
> approach and no one wanted to introduce a new type doomed to disappear
> after a single release cycle.

True.

> With TOAST, from the user's point of view you'll still have an
> _undefined_ max tuple length - the max will just be really, really
> large. Sure, the tuples will actually be fixed but large varying
> types can be split off into a series of tuples in the TOASTer
> oven, so to speak. So I guess I have difficulty understanding
> your argument.

False.

With TOAST, the maximum tuple length is limited by available
disk space (minus some overhead) and/or the number of bits we
use to represent the values original size and/or the size
addressable by the TOAST'ers table at all. Available space
allways limits the amount of data in a DB, and you allways
have to take some overhead into account, but calling this
_undefined_ isn't correct IMHO - better call it hard to
figure out.

The number of bits representing the attributes size is
another story, because we already decided to use some of the
top bits for special purposes, so a single attribute will
have some limit around 1/4 to 1 GB. Not too bad I think, who
would ever attempt to store a complete server backup in one
tuple? And which client/server combo will be able to handle
the required queries using the existing FE/BE protocol and
libpq implementation either. Thus there are other limits
causing problems before we need to continue this discussion,
surely.

> If text were implemented as lzText for a quick 7.1, which apparently
> was Jan's spin on the idea, then for 7.1 we'd say:

On the first look, it was a tempting solution. But there are
ton's of places in the backend, that assume text is binary
compatible to something or the bytes after the VARSIZE are
plain value bytes, not some compressed garbage to be passed
through a function first. Replacing TEXT by LZTEXT therefore
wouldn't be such an easy job, but would be working for the
wastebasked from the very beginning anyway, because TOAST
needs to revert it all again.

I don't like that kind of work.

Maybe I found some kind of compromise:

- We make LZTEXT a released type, without warning and anyone
can use it as needed.

- When featuring TOAST, we remove it and create a type
alias. This way, the "backend" will convert the table
schemas (WRT lztext->text) at reload time of the
conversion.

- We keep the type alias active past the next one or two
major releases. Someone skipping major releases,
converting from say 7.1 to 9.2, will have other problems
than replacing all occurences of lztext by text in his
dumps.

Actually I have some problems with the type coercion stuff.
There are functions lztext(text) and vice versa, but the
system is unable to find an "=" operator for lztext and text
when issuing

SELECT * FROM t1, t2, WHERE t1.lztext_att = t2.text_att;

This worked in the past releases (IIRC), so I wonder if the
failure above is a wanted "feature". I'll commit the stuff I
have tomorrow and hope someone can help me to get the
coercion working. All we have to do then is to tell in the
release notes and docs "Never use LZTEXT type name explicitly
in an application query (like for type casting) - use TEXT
instead".

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#========================================= wieck(at)debis(dot)com (Jan Wieck) #

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2000-02-26 04:12:26 ALTER TABLE DROP COLUMN
Previous Message Don Baccus 2000-02-26 01:54:43 Re: [HACKERS] LZTEXT for rule plan stings