Re: Repair cosmetic damage (done by pg_indent?)

From: daveg <daveg(at)sonic(dot)net>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Decibel! <decibel(at)decibel(dot)org>, pgsql-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Repair cosmetic damage (done by pg_indent?)
Date: 2007-08-04 21:09:58
Message-ID: 20070804210958.GB5770@sonic.net
Lists: pgsql-patches

On Sat, Aug 04, 2007 at 09:04:33PM +0100, Gregory Stark wrote:
> "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:
>
> > Gregory Stark <stark(at)enterprisedb(dot)com> writes:
> >> The scenario I was describing was having, for example, 20 fields each
> >> of which are char(100) and store 'x' (which are padded with 99
> >> spaces). So the row is 2k but the fields are highly compressible, but
> >> shorter than the 256 byte minimum.
> >
> > To be blunt, the solution to problems like that is sending the DBA to a
> > re-education camp. I don't think we should invest huge amounts of
> > effort on something that's trivially fixed by using the correct datatype
> > instead of the wrong datatype.
>
> Sorry, there was a bit of a mixup here. The scenario I described above is what
> it would take to get Postgres to actually try to compress a small string given
> the way the toaster works.
>
> In the real world interesting cases wouldn't be so extreme. Having a single
> CHAR(n) or a text field which contains any other very compressible string
> could easily not be compressed currently due to being under 256 bytes.
>
> I think the richer target here is doing some kind of cross-record compression.
> For example, xml text columns often contain the same tags over and over again
> in successive records but any single datum wouldn't be compressible.

I have a table of (id serial primary key, url text unique) with a few
hundred million URLs that average about 120 bytes each. The url index is
only used when a possibly new url is to be inserted, but between the data
and the index this table occupies a large part of the page cache. Any form
of compression here would be really helpful.
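
To illustrate why cross-record compression could help with short values like
these, here is a rough sketch in Python. It is only an illustration under
stated assumptions: it uses zlib with a preset dictionary rather than
PostgreSQL's own pglz compressor, the URLs are made-up sample data, and the
"shared dictionary" is just the first few values concatenated. Nothing here
reflects how the toaster works or how such a feature would be implemented.

```python
import zlib

# Hypothetical sample data: short URLs that share a lot of structure,
# standing in for the values of a "url text" column.
urls = [
    f"http://www.example.com/articles/2007/08/{i:04d}/index.html?ref=feed".encode()
    for i in range(1000)
]


def compressed_alone(data: bytes) -> int:
    """Size of one value compressed with no knowledge of other rows."""
    return len(zlib.compress(data, 9))


def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    """Compress one value against a dictionary shared across rows."""
    c = zlib.compressobj(level=9, zdict=zdict)
    return c.compress(data) + c.flush()


def roundtrip_with_dict(data: bytes, zdict: bytes) -> bytes:
    """Compress and decompress with the same shared dictionary."""
    d = zlib.decompressobj(zdict=zdict)
    return d.decompress(compress_with_dict(data, zdict)) + d.flush()


# Assumption: a crude shared dictionary built from the first 20 values.
shared = b"".join(urls[:20])

raw = sum(len(u) for u in urls)
alone = sum(compressed_alone(u) for u in urls)
with_dict = sum(len(compress_with_dict(u, shared)) for u in urls)

print("raw bytes:            ", raw)
print("compressed per datum: ", alone)
print("compressed with dict: ", with_dict)
```

Compressed one at a time, each value has too little internal repetition to
shrink much, while the shared dictionary lets the common prefix and suffix be
encoded as back-references. The same idea would apply to repeated XML tags.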

-dg

--
David Gould daveg(at)sonic(dot)net
If simplicity worked, the world would be overrun with insects.
