Re: bytea

From: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
To: Don Baccus <dhogaza(at)pacifier(dot)com>
Cc: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: bytea
Date: 2000-09-30 02:35:45
Message-ID: 200009300235.WAA03083@candle.pha.pa.us
Lists: pgsql-hackers

This brings up some good issues for the 7.2 release. Will large objects
become just an API on top of toast, or should they remain as a separate
physical storage format?

> At 08:30 PM 3/15/00 -0500, Bruce Momjian wrote:
>
> >Yes, we should keep it. I see now it is for purely binary data, while
> >text is for null-terminated strings.
>
> donb=# create table foo (b bytea);
> CREATE
> donb=# insert into foo values('ab\0cd');
> INSERT 107497 1
> donb=# select * from foo;
> b
> ----
> ab
> (1 row)
>
> donb=#
>
> Thus my comment "maybe they should be made to work" :)
>
> I don't know what's actually inside attr b, but the "cd" is at least
> dropped on output.
>
> For the BLOB hack I did for our toolkit I did the equivalent of
> uuencoding the input, which costs a predictable 4/3 expansion of
> the binary data (this is a segmented type, all done outside PG
> via SQL, triggers, and AOLserver driver magic but lets us stuff
> binary data such as photos etc, and pg_dump/restore them).
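A minimal sketch of the 4/3 arithmetic above, assuming Python's standard `binascii.b2a_uu` as the encoder. It is not the toolkit's actual code, which the message says was done in SQL, triggers, and the AOLserver driver. The helper names `uu_encode` and `uu_decode` are hypothetical. Each full 45-byte line becomes 60 payload characters (exactly 4/3), plus one length character and a newline.

```python
import binascii

UU_LINE_BYTES = 45  # classic uuencode packs at most 45 input bytes per line


def uu_encode(data: bytes) -> bytes:
    """Encode data as uuencoded lines (no begin/end framing)."""
    lines = []
    for start in range(0, len(data), UU_LINE_BYTES):
        chunk = data[start:start + UU_LINE_BYTES]
        # b2a_uu emits: 1 length character + encoded payload + newline
        lines.append(binascii.b2a_uu(chunk))
    return b"".join(lines)


def uu_decode(text: bytes) -> bytes:
    """Decode lines produced by uu_encode back into the original bytes."""
    return b"".join(binascii.a2b_uu(line) for line in text.splitlines())
```

Because the expansion is fixed per line, the stored size can be computed from the input length alone, which is what makes the cost "predictable".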
>
> If TOAST weren't on the way, I'd sit down and do a proper BLOB.
> As I explained to the folks on our web toolkit team, lo is
> tantalizingly close to being useful for folks like us, without
> actually being useful.
>
> BLOBs should sit atop TOAST, though, and perhaps specialized I/O
> routines for a BLOB type could be made. Those for bytea could
> be changed, too, at the risk of breaking existing code? But since
> bytea really acts like text, perhaps no existing code depends on
> it that couldn't just operate on text instead, so there could be
> freedom to change it?
>
> For real binary data, uuencoded strings are a better choice for
> a printable output form than the text+\nnn form (since a high
> proportion of bytes will be emitted in the lengthy \nnn form).
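A rough way to see the size difference, as a sketch only. The helper `escape_octal` is hypothetical and assumes a text+\nnn style where printable ASCII passes through, a backslash is doubled, and every other byte becomes a four-character octal escape. `base64` stands in here for any 4/3 encoding such as uuencode.

```python
import base64


def escape_octal(data: bytes) -> str:
    """Render bytes as printable ASCII plus backslash-octal escapes."""
    out = []
    for byte in data:
        if byte == 0x5C:            # backslash must itself be escaped
            out.append("\\\\")
        elif 0x20 <= byte <= 0x7E:  # printable ASCII passes through
            out.append(chr(byte))
        else:                       # everything else costs four characters
            out.append("\\%03o" % byte)
    return "".join(out)


all_bytes = bytes(range(256))            # one of every byte value
escaped_len = len(escape_octal(all_bytes))
base64_len = len(base64.b64encode(all_bytes))
```

For input where every byte value is equally likely, such as compressed photos, the escaped form comes out at more than twice the size of the 4/3 form.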
>
> But normally with BLOB one would like a way to just stuff a file
> or data in a buffer into it, etc, much like current lo. The printable
> dump of data is mostly useful for pg_dump, IMO - a binary backup would
> remove the need for such a hack, too.
>
> Standard BLOBs provide a way to stuff segments into the db...
>
> BLOBs, as done by TOAST or my current segmented table hack used in
> our toolkit, only require a single table (or a single table per
> underlying user table in the case of TOAST) so don't clutter the
> way lo does.
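The single-table segmented layout described above can be sketched as follows. This is an illustration under stated assumptions, not TOAST and not the toolkit's code: an in-memory SQLite database stands in for the server, and the table name `blob_segments`, the column names, and `SEGMENT_SIZE` are all hypothetical.

```python
import sqlite3

SEGMENT_SIZE = 2000  # hypothetical segment size, chosen only for illustration


def create_segment_table(conn):
    """One shared table holds the segments of every binary object."""
    conn.execute(
        "CREATE TABLE blob_segments ("
        " blob_id INTEGER NOT NULL,"
        " seq INTEGER NOT NULL,"
        " data BLOB NOT NULL,"
        " PRIMARY KEY (blob_id, seq))"
    )


def store_blob(conn, blob_id, data, segment_size=SEGMENT_SIZE):
    """Split data into fixed-size segments and insert one row per segment."""
    rows = [
        (blob_id, seq, data[start:start + segment_size])
        for seq, start in enumerate(range(0, len(data), segment_size))
    ]
    conn.executemany("INSERT INTO blob_segments VALUES (?, ?, ?)", rows)
    return len(rows)


def load_blob(conn, blob_id):
    """Reassemble an object by concatenating its segments in order."""
    cursor = conn.execute(
        "SELECT data FROM blob_segments WHERE blob_id = ? ORDER BY seq",
        (blob_id,),
    )
    return b"".join(row[0] for row in cursor)
```

Any number of objects share the one table, in contrast to the file plus index per object that the message attributes to lo.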
>
> But lo allows each binary object to be 2GB in length.
>
> So they kind of fit different needs. lo seems fine for those who
> need really huge objects, and probably not a bazillion (since each
> generates a file + index). My hack, or TOAST which will be similar
> in table usage (both being segmented types in common tables), is
> good for binary data of moderately large size not to exceed 2GB
> in aggregate.
>
> Of course, with 64-bit systems on the horizon, the 2GB aggregate
> limit will slowly begin to disappear, too. 'Til then, providing
> a "real BLOB" while retaining lo for those who need single REALLY
> huge data objects would seem best.
>
> - Don Baccus, Portland OR <dhogaza(at)pacifier(dot)com>
> Nature photos, on-line guides, Pacific Northwest
> Rare Bird Alert Service and other goodies at
> http://donb.photo.net.
>

--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
