From: | "Jeffrey W(dot) Baker" <jwbaker(at)acm(dot)org> |
---|---|
To: | Jan Wieck <janwieck(at)yahoo(dot)com> |
Cc: | Postgres general mailing list <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: more about pg_toast growth |
Date: | 2002-03-13 20:35:55 |
Message-ID: | 1016051755.5255.26.camel@heat |
Lists: | pgsql-general |
On Wed, 2002-03-13 at 12:16, Jan Wieck wrote:
> Jeffrey W. Baker wrote:
> > On Wed, 2002-03-13 at 07:22, Jan Wieck wrote:
> > > [...]
> > >
> > > Remember, TOAST doesn't only come in slices, don't you
> > > usually brown it? Meaning, the data gets compressed (with a
> > > lousy but really fast algorithm). What kind of data is
> > > resp_body? 50% compression ratio ... I guess it's html,
> > > right?
> >
> > It is gzipped and base64-encoded text. It's somewhat strange that a
> > fast LZ would deflate it very much, but I guess it must be an artifact
> > of the base64. The initial gzip tends to deflate the data by about 90%.
>
> Now THAT is very surprising to me! The SLZ algorithm used in
> TOAST will for sure not be able to squeeze anything out of a
> gzip compressed stream. The result would be bigger again.
> B64 changes the file size basically to 4/3rd, but since the
> input stream is gzipped, the resulting B64 stream shouldn't
> contain patterns that SLZ can use to reduce the size again.
>
> Are you sure you're B64-encoding the gzipped text?
I am positive:
rupert=# select substr(body, 0, 200) from resp_body where resp = (select
max(resp) from resp_body);
eJztfXt34riy799hrf4OGuZMJ1k3BL949SScRQhJmCbAAbp7z75zV5bAAjxtbI5tkjB75rvfkiwb
GxxDHt0dgvtBjC2VpFLVr6qkknMydiZ6+WRMsFo+6dV7jVqZnOE5ami2oxkjG31ALWdMLLgxIIZN
UFvHDrFPsm7Z1MmEOBiNHWeaIf87025P07X7qWYRO40Gp
rupert=# select min(length(body)), max(length(body)), avg(length(body))
from resp_body;
 min |  max   |       avg
-----+--------+------------------
   0 | 261948 | 21529.5282897281
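
One way to double-check the claim that the stored text really is base64 over a compressed stream is to decode the first few characters of the sample above and look at the header bytes. The sketch below is only an illustration: the helper names `looks_like_zlib_stream` and `looks_like_gzip_file` are made up for this example, and it assumes the column holds plain base64 with no leading whitespace. It checks for a zlib (RFC 1950) header and for the gzip file magic bytes `1f 8b`.

```python
import base64


def looks_like_zlib_stream(b64_text: str) -> bool:
    """True if the base64 text decodes to bytes that begin with a valid zlib header.

    Needs at least four base64 characters, which decode to exactly three bytes.
    """
    head = base64.b64decode(b64_text[:4])
    cmf, flg = head[0], head[1]
    uses_deflate = (cmf & 0x0F) == 8               # compression method 8 = deflate
    checksum_ok = ((cmf << 8) | flg) % 31 == 0     # header check required by RFC 1950
    return uses_deflate and checksum_ok


def looks_like_gzip_file(b64_text: str) -> bool:
    """True if the base64 text decodes to bytes that begin with the gzip magic number."""
    head = base64.b64decode(b64_text[:4])
    return head[:2] == b"\x1f\x8b"


# First line of the substr() output shown above.
sample = "eJztfXt34riy799hrf4OGuZMJ1k3BL949SScRQhJmCbAAbp7z75zV5bAAjxtbI5tkjB75rvfkiwb"

print("zlib header:", looks_like_zlib_stream(sample))
print("gzip magic: ", looks_like_gzip_file(sample))
```

The leading `eJ` decodes to the bytes `78 9c`, which is a zlib header, so the sample is consistent with deflate-compressed data in the zlib wrapper (the same compression gzip uses, without the gzip file header) followed by base64 encoding.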
> I mean,
> you have an average body size of 23K "gzipped", so you're
> telling me that the average uncompressed body size is about
> 230K? You are storing 230 Megabytes of raw body data per
> hour? Man, who is writing all that text?
Reuters.
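
For reference, the size estimate in the quoted question can be redone from the numbers in this message. This is a rough sketch under stated assumptions: `length(body)` counts base64 characters with no embedded line breaks, base64 expands its input by 4/3, and the initial compression removes about 90% of the raw text, as mentioned earlier in the thread. The function names are hypothetical.

```python
def base64_encoded_length(n_bytes: int) -> int:
    """Length of padded base64 output for n_bytes of input (4 chars per 3-byte group)."""
    return 4 * ((n_bytes + 2) // 3)


def estimate_sizes(avg_b64_len: float, deflate_fraction: float = 0.90):
    """Estimate (compressed_bytes, raw_bytes) behind an average base64 length.

    deflate_fraction is the share of the raw text removed by compression;
    0.90 is the approximate figure quoted in the thread, not a measured value.
    """
    compressed = avg_b64_len * 3 / 4                 # undo the 4/3 base64 expansion
    raw = compressed / (1.0 - deflate_fraction)      # undo the compression
    return compressed, raw


avg_len = 21529.5282897281  # avg(length(body)) from the query above
compressed, raw = estimate_sizes(avg_len)
print(f"compressed: about {compressed:,.0f} bytes")
print(f"raw text:   about {raw:,.0f} bytes")
```

Under these assumptions the average body is roughly 16 KB compressed and on the order of 160 KB uncompressed, somewhat below the 230K guessed in the question but of the same magnitude.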
I have increased the free space map and will be able to restart the
postmaster today at around midnight GMT.
Thanks for your help,
Jeffrey