Re: Add ZSON extension to /contrib/

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, Andrew Dunstan <andrew(at)dunslane(dot)net>
Subject: Re: Add ZSON extension to /contrib/
Date: 2021-05-28 10:35:34
Message-ID: de0676cb-5289-de45-4567-4df003f3d6ce@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 5/26/21 6:43 PM, Matthias van de Meent wrote:
> On Wed, 26 May 2021 at 12:49, Aleksander Alekseev
> <aleksander(at)timescale(dot)com> wrote:
>>
>> Hi hackers,
>>
>> Many thanks for your feedback, I very much appreciate it!
>>
>>> If the extension is mature enough, why make it an extension in
>>> contrib, and not instead either enhance the existing jsonb type with
>>> it or make it a built-in type?
>>
>>> IMO we have too d*mn many JSON types already. If we can find a way
>>> to shoehorn this optimization into JSONB, that'd be great. Otherwise
>>> I do not think it's worth the added user confusion.
>>
>> Magnus, Tom,
>>
>> My reasoning is that if the problem can be solved with an extension
>> there is little reason to modify the core. This seems to be in the
>> spirit of PostgreSQL. If the community reaches the consensus to modify
>> the core to introduce a similar feature, we could discuss this as
>> well. It sounds like a lot of unnecessary work to me though (see
>> below).
>>
>>> * doesn't cover all cases, notably indexes.
>>
>> Tom,
>>
>> Not sure if I follow. What cases do you have in mind?
>>
>>> Do note that e.g. postgis is not in contrib, but is available in e.g. RDS.
>>
>> Matthias,
>>
>> Good point. I suspect that PostGIS is an exception though...
>
> Quite a few other non-/common/ extensions are available in RDS[0],
> some of which are HLL (from citusdata), pglogical (from 2ndQuadrant)
> and orafce (from Pavel Stehule, orafce et al.).
>
>>> I like the idea of the ZSON type, but I'm somewhat disappointed by its
>>> current limitations
>>
>> Several people suggested various enhancements right after learning
>> about ZSON. Time showed, however, that none of the real-world users
>> really need e.g. more than one common dictionary per database. I
>> suspect this is because no one has more than 2**16 repeatable unique
>> strings (one dictionary limitation) in their documents. Thus there is
>> no benefit in having separate dictionaries and corresponding extra
>> complexity.
>
> IMO the main benefit of having different dictionaries is that you
> could have a small dictionary for small and very structured JSONB
> fields (e.g. some time-series data), and a large one for large /
> unstructured JSONB fields, without having the significant performance
> impact of having that large and varied dictionary on the
> small&structured field. Although a binary search is log(n) and thus
> still quite cheap even for large dictionaries, the extra size is
> certainly not free, and you'll be touching more memory in the process.
>

I'm sure we can think of various other arguments for allowing separate
dictionaries. For example, what if you drop a column? With one huge
dictionary you're bound to keep the data forever. With per-column dicts
you can just drop the dict and free disk space / memory.

I also find it hard to believe that no one needs 2**16 strings. I mean,
65k is not that much, really. To give an example, I've been toying with
storing bitcoin blockchain in a database - one way to do that is storing
each block as a single JSONB document. But each "item" (eg. transaction)
is identified by a unique hash, so that means (tens of) thousands of
unique strings *per document*.

Yes, it's a bit silly and extreme, and maybe the compression would not
help much in this case. But it shows that 2**16 is damn easy to hit.

In other words, this seems like a nice example of survivor bias, where
we only look at cases for which the existing limitations are acceptable,
ignoring the (many) remaining cases eliminated by those limitations.

>>> - Each dictionary uses a lot of memory, regardless of the number of
>>> actual stored keys. For 32-bit systems the base usage of a dictionary
>>> without entries ((sizeof(Word) + sizeof(uint16)) * 2**16) would be
>>> almost 1MB, and for 64-bit it would be 1.7MB. That is significantly
>>> more than I'd want to install.
>>
>> You are probably right on this one, this part could be optimized. I
>> will address this if we agree on submitting the patch.
>>

I'm sure it can be optimized, but I also think it's focusing on the base
memory usage too much.

What I care about is the end result, i.e. how much disk space / memory I
save at the end. I don't care if it's 1MB or 1.7MB if using the
compression saves me e.g. 50% of disk space. And it's completely
irrelevant if I can't use the feature because of limitations stemming
from the "single dictionary" limitations (in which case I'll save the
0.7MB, but I also lose the 50% disk space savings - not a great trade
off, if you ask me).

>>> - You call gettimeofday() in both dict_get and in get_current_dict_id.
>>> These functions can be called in short and tight loops (for small GSON
>>> fields), in which case it would add significant overhead through the
>>> implied syscalls.
>>
>> I must admit, I'm not an expert in this area. My understanding is that
>> gettimeofday() is implemented as single virtual memory access on
>> modern operating systems, e.g. VDSO on Linux, thus it's very cheap.
>> I'm not that sure about other supported platforms though. Probably
>> worth investigating.
>
> Yes, but vDSO does not necessarily work on all systems: e.g. in 2017,
> a lot on EC2 [1] was run using Xen with vDSO not working for
> gettimeofday. I'm uncertain if this issue persists for their new
> KVM/Nitro hypervisor.
>

Yeah. Better not to call gettimeofday is very often. I have no idea why
the code even does that, though - it seems to be deciding whether it's
OK to use cached dictionary based on the timestamp, but that seems
rather dubious. But it's hard to say, because there are about no useful
comments *anywhere* in the code.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2021-05-28 10:43:30 Re: Add ZSON extension to /contrib/
Previous Message Daniel Gustafsson 2021-05-28 09:04:12 Re: Support for NSS as a libpq TLS backend