Re: jsonb format is pessimal for toast compression

From: Arthur Silva <arthurprs(at)gmail(dot)com>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: "David E(dot) Wheeler" <david(at)justatheory(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Jan Wieck <jan(at)wi3ck(dot)info>
Subject: Re: jsonb format is pessimal for toast compression
Date: 2014-09-16 01:27:07
Message-ID: CAO_YK0U1FxvhhLQGCk_WzueBduzCQU_UfzngKDNZ3k6_Wue1fQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I couldn't get my hands on the twitter data but I'm generating my own. The
json template is http://paste2.org/wJ1dfcjw and data was generated with
http://www.json-generator.com/. It has 35 top level keys, just in case
someone is wondering.
I generated 10000 random objects and I'm inserting them repeatedly until I
got 320k rows.

Test query: SELECT data->>'name', data->>'email' FROM t_json
Test storage: EXTERNAL
Test jsonb lengths quartiles: {1278,1587,1731,1871,2231}
Tom's lengths+cache aware: 455ms
HEAD: 440ms

This is a realistic-ish workload in my opinion and Tom's patch performs
within 4% of HEAD.

Due to the overall lenghts I couldn't really test compressibility so I
re-ran the test. This time I inserted an array of 2 objects in each row, as
in: [obj, obj];
The objects where taken in sequence from the 10000 pool so contents match
in both tests.

Test query: SELECT data #> '{0, name}', data #> '{0, email}', data #> '{1,
name}', data #> '{1, email}' FROM t_json
Test storage: EXTENDED
HEAD: 17mb table + 878mb toast
HEAD size quartiles: {2015,2500,2591,2711,3483}
HEAD query runtime: 15s
Tom's: 220mb table + 580mb toast
Tom's size quartiles: {1665,1984,2061,2142.25,2384}
Tom's query runtime: 13s

This is an intriguing edge case that Tom's patch actually outperform the
base implementation for 3~4kb jsons.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2014-09-16 01:46:04 Re: jsonb format is pessimal for toast compression
Previous Message Peter Geoghegan 2014-09-16 00:18:39 Re: B-Tree support function number 3 (strxfrm() optimization)