Re: Add ZSON extension to /contrib/

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add ZSON extension to /contrib/
Date: 2021-05-26 21:29:29
Message-ID: 20210526212929.GA3048@momjian.us

On Tue, May 25, 2021 at 01:55:13PM +0300, Aleksander Alekseev wrote:
> Hi hackers,
>
> Back in 2016 while being at PostgresPro I developed the ZSON extension [1]. The
> extension introduces the new ZSON type, which is 100% compatible with JSONB but
> uses a shared dictionary of strings most frequently used in given JSONB
> documents for compression. These strings are replaced with integer IDs.
> Afterward, PGLZ (and now LZ4) applies if the document is large enough by common
> PostgreSQL logic. Under certain conditions (many large documents), this saves
> disk space, memory and increases the overall performance. More details can be
> found in README on GitHub.

I think this is interesting because it is one of the few cases that
allow compression outside of a single column. Here is a list of
compression options:

https://momjian.us/main/blogs/pgblog/2020.html#April_27_2020

1. single field
2. across rows in a single page
3. across rows in a single column
4. across all columns and rows in a table
5. across tables in a database
6. across databases

While standard Postgres does #1, ZSON allows #2 through #5, provided the
data is stored in the ZSON data type. I think this cross-field
compression has great potential for cases where the data is not
relational, or hasn't had time to be structured relationally. It also
opens questions of how to do this cleanly in a relational system.
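
To make the idea concrete, here is a rough sketch of dictionary-based
compression across documents: the most frequent strings are learned from
many JSON documents and then replaced with integer IDs in each one. This
is only an illustration of the approach described in the quoted text,
not ZSON's actual storage format or API. The names Ref,
learn_dictionary, encode, and decode are hypothetical, and the sketch
works on parsed Python objects rather than on JSONB binary data.

```python
from collections import Counter


class Ref(int):
    """Hypothetical marker type: an integer ID standing in for a string."""


def iter_strings(node):
    """Yield every object key and string value in a parsed JSON document."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield key
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)
    elif isinstance(node, str):
        yield node


def learn_dictionary(docs, max_entries=1000):
    """Map the most frequent strings across all documents to integer IDs."""
    counts = Counter(s for doc in docs for s in iter_strings(doc))
    return {s: i for i, (s, _) in enumerate(counts.most_common(max_entries))}


def encode(node, dictionary):
    """Replace strings found in the shared dictionary with Ref IDs."""
    def substitute(s):
        return Ref(dictionary[s]) if s in dictionary else s

    if isinstance(node, dict):
        return {substitute(k): encode(v, dictionary) for k, v in node.items()}
    if isinstance(node, list):
        return [encode(v, dictionary) for v in node]
    if isinstance(node, str):
        return substitute(node)
    return node  # numbers, booleans, and null pass through unchanged


def decode(node, dictionary):
    """Invert encode() using the same shared dictionary."""
    # Rebuilt on every call for brevity; a real implementation would cache it.
    strings = {i: s for s, i in dictionary.items()}

    def restore(x):
        return strings[int(x)] if isinstance(x, Ref) else x

    if isinstance(node, dict):
        return {restore(k): decode(v, dictionary) for k, v in node.items()}
    if isinstance(node, list):
        return [decode(v, dictionary) for v in node]
    return restore(node)
```

Because the dictionary is learned from a whole set of documents rather
than from one value, repeated keys and values shrink even when each
individual document is too small for per-field compression to help.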

--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EDB https://enterprisedb.com

If only the physical world exists, free will is an illusion.
