Re: JSONB spaces in text presentation

From: "Ilya I(dot) Ashchepkov" <koctep(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: John R Pierce <pierce(at)hogranch(dot)com>, pgsql <pgsql-general(at)postgresql(dot)org>
Subject: Re: JSONB spaces in text presentation
Date: 2014-09-24 16:23:48
Message-ID: CALgmQD_hcSsHPukyyWo7T1wm6ty82FCGGCOSqZwvLSexE5OZPQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

IMHO, prettification is useful only for debugging.
It would be nice to have a session variable for the debug output with
spaces, new lines and indentation.

On Wed, Sep 24, 2014 at 8:44 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:

> On Wed, Sep 24, 2014 at 2:44 AM, Ilya I. Ashchepkov <koctep(at)gmail(dot)com>
> wrote:
> > I'm sorry about sending email several times. I haven't understand, was it
> > sent by gmail or not.
> >
> >
> > On Wed, Sep 24, 2014 at 2:30 PM, John R Pierce <pierce(at)hogranch(dot)com>
> wrote:
> >>
> >> On 9/24/2014 12:23 AM, Ilya I. Ashchepkov wrote:
> >>>
> >>>
> >>> Is spaces is necessary in text presentation of JSONB?
> >>> In my data resulting text contains ~12% of spaces.
> >>
> >>
> >> can you show us an example of this?
> >
> >
> > One record
> > # select data from events.data limit 1;
> > {"can": {"lls": {"1": 76.4}, "mhs": 4674.85, "rpm": 168.888, "speed": 74,
> > "runned": 166855895, "fuel_consumption": 74213.5}, "crc": 10084, "gps":
> 1,
> > "gsm": {"signal": 100}, "lls": {"1": 733, "2": 717}, "used": 19, "speed":
> > 87.4, "valid": 1, "msg_id": 89, "runned": 72.75, "boot_no": 256,
> "digital":
> > {"in": {"1": 1, "2": 0, "3": 0, "4": 0, "5": 0, "6": 0}, "out": {"1": 0,
> > "2": 0}}, "visible": 20, "ignition": 1, "location": {"course": 265,
> > "altitude": 143, "latitude": 55.127888997395836, "longitude":
> > 80.8046142578125}, "protocol": 4, "coldstart": 1, "timesource":
> "terminal",
> > "receiver_on": 1, "external_power": 28.07, "internal_power": 4.19}
> >
> > Whitespacis percents in this record:
> > # select array_length(regexp_split_to_array(data::text, text ' '),
> > 1)*100./length(data::text) from events.data limit 1;
> > ?column?
> > ---------------------
> > 12.3417721518987342
> >
> > Whitespace in test data
> > # select count(*),avg(array_length(regexp_split_to_array(data::text,
> text '
> > '), 1)*100./length(data::text)) from events.data ;
> > count | avg
> > --------+---------------------
> > 242222 | 12.3649234646118312
>
>
> For jsonb (unlike json), data is not actually stored as json but in a
> binary format. It will generally be much larger than the text
> representation in fact but in exchange for that many operations will
> be faster. The spaces you see are generated when the jsonb type is
> converted to text for output. I actually think it's pretty reasonable
> to want to redact all spaces from such objects in all cases where
> converstion to text happens (output functions, xxxto_json, etc)
> because ~12% savings are nothing to sneeze at when moving large
> documents in and out of the database.
>
> On the flip side, a more verbose prettification would be pretty nice
> too. I wonder if a hypothetical GUC is the best way to control this
> behavior...
>
> merlin
>

--
С уважением,
Ащепков Илья koctep(at)gmail(dot)com

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2014-09-24 16:27:34 Re: Question about row_number() ordering semantics
Previous Message Daniel Lenski 2014-09-24 16:04:21 Why can't I select un-grouped columns when grouping by a (non-primary) unique key?