From: | "Joel Jacobson" <joel(at)compiler(dot)org> |
---|---|
To: | "Tomas Vondra" <tomas(dot)vondra(at)enterprisedb(dot)com>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, "jian he" <jian(dot)universality(at)gmail(dot)com> |
Cc: | "Tom Dunstan" <pgsql(at)tomd(dot)cc>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Do we want a hashset type? |
Date: | 2023-06-19 11:33:35 |
Message-ID: | 6e9d18cc-e09a-4933-853a-68ffe0653d0b@app.fastmail.com |
Lists: | pgsql-hackers |
On Mon, Jun 19, 2023, at 11:21, Tomas Vondra wrote:
> AFAICS the standard only defines arrays and multisets. Arrays are pretty
> much the thing we have, including the ARRAY[] constructor etc. Multisets
> are similar to hashset discussed here, except that it tracks the number
> of elements for each value (which would be trivial in hashset).
>
> So if we want to make this a built-in feature, maybe we should aim to do
> the multiset thing, with the standard SQL syntax? Extending the grammar
> should not be hard, I think. I'm not sure of the underlying code
> (ArrayType, ARRAY_SUBLINK stuff, etc.) we could reuse or if we'd need a
> lot of separate code doing that.
Multisets handle duplicates differently from sets: they keep a count for each
value, which may bring unexpected issues. Sets and multisets have distinct
utility in C++, Rust, Java, etc. However, sets are more fundamental and more
prevalent in standard libraries than multisets.
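As a rough illustration of that difference (a minimal sketch only, using Python's
built-in set and collections.Counter as stand-ins for a set and a multiset; none
of these names come from the proposed hashset type or from the SQL standard):

```python
from collections import Counter

values = [1, 2, 2, 3, 3, 3]

# A set keeps each distinct value once; duplicates are discarded on insert.
as_set = set(values)

# A multiset (approximated here by Counter) keeps a count per distinct value.
as_multiset = Counter(values)

# Adding an already-present value is a no-op for a set ...
set_after_add = as_set | {3}

# ... but changes the contents of a multiset (the count for 3 goes up).
multiset_after_add = as_multiset + Counter([3])

# Cardinality differs too: distinct values vs. total occurrences.
set_cardinality = len(as_set)
multiset_cardinality = sum(as_multiset.values())
```

Code that assumes set semantics (for example, that adding an existing element
changes nothing) can behave differently when handed a multiset, which is the kind
of surprise a separate type would avoid.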
Although the SQL standard defines multisets, I would prefer a distinct hashset
type, since it helps users choose the appropriate data structure and reduces misuse.
Beyond standards compliance, the need for multisets is unclear.
/Joel