Re: [PATCH] Compression dictionaries for JSONB

From: Aleksander Alekseev <aleksander(at)timescale(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Nikita Malakhov <hukutoc(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: [PATCH] Compression dictionaries for JSONB
Date: 2022-08-01 11:25:32
Message-ID: CAJ7c6TOx9NrLpu1SS9p2osY+5-6s=PTm_iFAQCpfgHr9H2hFWA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi hackers,

> So far we seem to have a consensus to:
>
> 1. Use bytea instead of NameData to store dictionary entries;
>
> 2. Assign monotonically ascending IDs to the entries instead of using
> Oids, as it is done with pg_class.relnatts. In order to do this we
> should either add a corresponding column to pg_type, or add a new
> catalog table, e.g. pg_dict_meta. Personally I don't have a strong
> opinion on what is better. Thoughts?
>
> Both changes should be straightforward to implement and also are a
> good exercise to newcomers.
>
> I invite anyone interested to join this effort as a co-author! (since,
> honestly, rewriting the same feature over and over again alone is
> quite boring :D).

cfbot complained that v5 doesn't apply anymore. Here is the rebased
version of the patch.

> Good point. This was not a problem for ZSON since the dictionary size
> was limited to 2**16 entries, the dictionary was immutable, and the
> dictionaries had versions. For compression dictionaries we removed the
> 2**16 entries limit and also decided to get rid of versions. The idea
> was that you can simply continue adding new entries, but no one
> thought about the fact that this will consume the memory required to
> decompress the document indefinitely.
>
> Maybe we should return to the idea of limited dictionary size and
> versions. Objections?
> [ ...]
> You are right. Another reason to return to the idea of dictionary versions.

Since no one objected so far and/or proposed a better idea I assume
this can be added to the list of TODOs as well.

--
Best regards,
Aleksander Alekseev

Attachment Content-Type Size
v6-0001-Compression-dictionaries-for-JSONB.patch application/octet-stream 48.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2022-08-01 11:57:01 Re: making relfilenodes 56 bits
Previous Message Pavel Borisov 2022-08-01 10:30:00 Re: Trying to add more tests to gistbuild.c