From: | "Alex Hunsaker" <badalex(at)gmail(dot)com> |
---|---|
To: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | "Stephen R(dot) van den Berg" <srb(at)cuci(dot)nl>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Significantly larger toast tables on 8.4? |
Date: | 2009-01-02 18:56:47 |
Message-ID: | 34d269d40901021056p22c82cadl3eb62ac57fa3766b@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Jan 2, 2009 at 09:01, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>The argument was that compressing/decompressing such large chunks would
>require a lot of CPU effort; also it would defeat attempts to fetch
>subsections of a large string. In the past we've required people to
>explicitly "ALTER TABLE SET STORAGE external" if they wanted to make
>use of the substring-fetch optimization, but it was argued that this
>would make that more likely to work automatically.
It is *way* faster here are some ruff numbers:
$ cat q.sql
select length(substring(debug, 1024, 1024)) from dies where die_id = 295;
$ ./pgbench -T60 -n -f q.sql
8.4: 1532.327546
8.3: 21.295657
Thats with 8.4 as a --enable-debug --enable-cassert build! (8.3 just
has --enable-debug)
> I'm not entirely convinced by Alex' analysis anyway; the only way
> those 39 large values explain the size difference is if they are
> *tremendously* compressible, like almost all zeroes. The toast
> compressor isn't so bright that it's likely to get 10X compression
> on typical data.
Well its certainly not all zeros, but those big rows are the same
message repeated (~5k) until perl detected endless recursion died...
So its not exactly typical or *useful* data either. The other 51,
971 rows (rows smaller than 1024*1024 bytes) account for 174MB while
those 39 account for the other 828MB.
Mainly I was just alarmed by the size increase. But I think the
pgbench numbers make me happy even if I never do a substring on that
data...
------------------
# 8.3
select pg_size_pretty(pg_column_size(debug)) from dies where die_id = 295;
pg_size_pretty
----------------
1256 kB
select pg_size_pretty(pg_column_size(debug)) from dies where die_id = 295;
pg_size_pretty
----------------
22 MB
select pg_size_pretty(sum(row_size)) from
( select
coalesce(pg_column_size(action), 0) +
coalesce(pg_column_size(die_id), 0) +
coalesce(pg_column_size(cparam), 0) +
coalesce(pg_column_size(date_created), 0) +
coalesce(pg_column_size(db_profile), 0) +
coalesce(pg_column_size(debug), 0) +
coalesce(pg_column_size(defunct), 0) +
coalesce(pg_column_size(env), 0) +
coalesce(pg_column_size(login), 0) +
coalesce(pg_column_size(msg), 0) +
coalesce(pg_column_size(open_user_id), 0) +
coalesce(pg_column_size(page_load_id), 0) +
coalesce(pg_column_size(session_id), 0) +
coalesce(pg_column_size(state), 0) +
coalesce(pg_column_size(state_action), 0) +
coalesce(pg_column_size(user_id), 0) +
coalesce(pg_column_size(whoops), 0) as row_size
from dies ) as foo where foo.row_size < 1024*1024;
pg_size_pretty
----------------
174 MB
select pg_size_pretty(sum(row_size)) from
( select
coalesce(pg_column_size(action), 0) +
coalesce(pg_column_size(die_id), 0) +
coalesce(pg_column_size(cparam), 0) +
coalesce(pg_column_size(date_created), 0) +
coalesce(pg_column_size(db_profile), 0) +
coalesce(pg_column_size(debug), 0) +
coalesce(pg_column_size(defunct), 0) +
coalesce(pg_column_size(env), 0) +
coalesce(pg_column_size(login), 0) +
coalesce(pg_column_size(msg), 0) +
coalesce(pg_column_size(open_user_id), 0) +
coalesce(pg_column_size(page_load_id), 0) +
coalesce(pg_column_size(session_id), 0) +
coalesce(pg_column_size(state), 0) +
coalesce(pg_column_size(state_action), 0) +
coalesce(pg_column_size(user_id), 0) +
coalesce(pg_column_size(whoops), 0) as row_size
from dies ) as foo where foo.row_size < 1024*1024;
pg_size_pretty
----------------
828 MB
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2009-01-02 19:20:17 | Re: Documenting serializable vs snapshot isolation levels |
Previous Message | Tom Lane | 2009-01-02 18:47:51 | Re: Documenting serializable vs snapshot isolation levels |