From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Josh Berkus <josh(at)agliodbs(dot)com> |
Cc: | Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>, Peter Geoghegan <pg(at)heroku(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, Kevin Grittner <kgrittn(at)ymail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Larry White <ljw1001(at)gmail(dot)com> |
Subject: | Re: jsonb format is pessimal for toast compression |
Date: | 2014-08-14 23:02:08 |
Message-ID: | 25593.1408057328@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Josh Berkus <josh(at)agliodbs(dot)com> writes:
> So, here's a destruction test case:
> 200,000 JSON values (plus 2 key columns)
> Average width 4K (+/- 1K)
> 183 keys per JSON value
Is that 183 keys exactly each time, or is 183 the average?
If so, what's the min/max number of keys?
I ask because 183 would be below the threshold where I'd expect the
no-compression behavior to kick in.
> And, we see the effect:
> postgres=# select pg_size_pretty(pg_total_relation_size('jsonic'));
> pg_size_pretty
> ----------------
> 394 MB
> (1 row)
> postgres=# select pg_size_pretty(pg_total_relation_size('jsonbish'));
> pg_size_pretty
> ----------------
> 1147 MB
> (1 row)
> So, pretty bad; JSONB is 200% larger than JSON.
Ouch. But it's not clear how much of this is from the first_success_by
threshold and how much is from having poor compression even though we
escaped that trap.
> BTW, I find this peculiar:
> postgres=# select pg_size_pretty(pg_relation_size('jsonic'));
> pg_size_pretty
> ----------------
> 383 MB
> (1 row)
> postgres=# select pg_size_pretty(pg_relation_size('jsonbish'));
> pg_size_pretty
> ----------------
> 11 MB
> (1 row)
pg_relation_size is just the main data fork; it excludes TOAST.
So what we can conclude is that most of the data got toasted out-of-line
in jsonb, while very little did in json. That probably just comes from
the average datum size being close to the push-out-of-line threshold,
so that worse compression puts it over the edge.
It would be useful to see min/max/avg of pg_column_size() in both
these cases.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2014-08-14 23:02:46 | Re: Minmax indexes |
Previous Message | Josh Berkus | 2014-08-14 22:49:47 | Re: jsonb format is pessimal for toast compression |