Re: Reproducible collisions in jsonb_hash

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Valeriy Meleshkin <valeriy(at)meleshk(dot)in>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reproducible collisions in jsonb_hash
Date: 2022-05-15 16:03:26
Message-ID: 20220515160326.GB9030@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > Here, that doesn't seem too likely. You could have a column that
> > contains 'tom' and ['tom'] and [['tom']] and [[['tom']]] and so forth
> > and they all get mapped onto the same bucket and you're sad. But
> > probably not.
>
> Yeah, that might be a more useful way to think about it: is this likely
> to cause performance-critical collisions in practice? I agree that
> that doesn't seem like a very likely situation, even given that you
> might be using json for erratically-structured data.

Particularly for something like jsonb (but maybe other things?), having a
hash function that could be user-defined, or that at least had some options,
seems like it would be quite nice (similar to compression...). If we
were to go in the direction of changing this, I'd suggest that we try to
make it something where the existing function could still be used while
also allowing a new one to be used. More flexibility would be even
better, of course (column-specific hash functions come to mind...).
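
A minimal sketch of that idea, in Python rather than PostgreSQL C, purely to
illustrate the shape of "keep the existing function, allow a new one". All
names here (legacy_hash, nested_hash, HASH_FUNCTIONS, hash_for_column) are
hypothetical, and legacy_hash is only a toy stand-in that ignores array
nesting; it is not the actual jsonb_hash algorithm.

```python
import hashlib
import json


def _digest(value):
    """Hash the canonical JSON text of a value down to a 64-bit integer."""
    data = json.dumps(value, sort_keys=True).encode()
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")


def legacy_hash(value):
    """Toy stand-in for the existing behavior: single-element array
    nesting is stripped before hashing, so 'tom' and [['tom']] collide."""
    while isinstance(value, list) and len(value) == 1:
        value = value[0]
    return _digest(value)


def nested_hash(value):
    """Toy alternative: hashes the full structure, so nesting depth counts."""
    return _digest(value)


# Hypothetical registry: the old function stays available under its own
# name, and a new one can be selected explicitly.
HASH_FUNCTIONS = {"legacy": legacy_hash, "nested": nested_hash}


def hash_for_column(value, hash_name="legacy"):
    """Hash a value with the function chosen for a (hypothetical) column
    option; the default keeps the existing behavior."""
    return HASH_FUNCTIONS[hash_name](value)


samples = ["tom", ["tom"], [["tom"]], [[["tom"]]]]
legacy_values = {hash_for_column(v) for v in samples}
nested_values = {hash_for_column(v, "nested") for v in samples}
```

With the default, all four sample values land on one hash value, while
selecting "nested" separates them. Defaulting to the old function means
nothing that already depends on it has to change.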

Agreed with the general conclusion here also; I just wanted to share some
thoughts on possible future directions to go in.

Thanks,

Stephen
