Re: Do we want a hashset type?

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "Tomas Vondra" <tomas(dot)vondra(at)enterprisedb(dot)com>, "jian he" <jian(dot)universality(at)gmail(dot)com>
Cc: "Tom Dunstan" <pgsql(at)tomd(dot)cc>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Do we want a hashset type?
Date: 2023-06-20 14:56:12
Message-ID: 3b9c7e39-e5be-488f-8e86-0cab533b2dc6@app.fastmail.com
Lists: pgsql-hackers

On Tue, Jun 20, 2023, at 14:10, Tomas Vondra wrote:
> On 6/20/23 12:59, Joel Jacobson wrote:
>> On Mon, Jun 19, 2023, at 02:00, jian he wrote:
>>> select hashset_contains('{1,2}'::int4hashset,NULL::int);
>>> should return null?
>>
>> I agree, it should.
>>
>> I've now changed all functions except int4hashset() (the init function)
>> and the aggregate functions to be STRICT.
>
> I don't think this is correct / consistent with what we do elsewhere.
> IMHO it's perfectly fine to have a hashset containing a NULL value,

The reference to consistency with what we do elsewhere might not be entirely
applicable in this context, since the set feature we're designing is a new beast
in the SQL landscape.

I think adhering to the theoretical purity of sets by excluding NULLs aligns us
with set theory, simplifies our code, and parallels set implementations in other
languages.

I think we have an opportunity here to innovate and potentially influence a
future set concept in the SQL standard.

However, I see how one could argue against this reasoning, on the basis that
PostgreSQL users are used to NULLs and might expect them to be allowed in
every data structure.

A different perspective is to look at what use-cases we can foresee.

I've been trying hard, but I can't find compelling use-cases where a NULL element
in a set would offer a more natural SQL query than handling NULLs within SQL and
keeping the set NULL-free.
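
As a rough sketch of what "handling NULLs within SQL" could look like, assuming
the extension under discussion is installed and hashset_contains() is STRICT as
proposed above (the "input" CTE and the column aliases are made up purely for
illustration):

```sql
-- Assumption: int4hashset and hashset_contains() come from the extension
-- discussed in this thread, with hashset_contains() declared STRICT.
WITH input(x) AS (
    VALUES (1), (3), (NULL::int)
)
SELECT x,
       -- With STRICT, a NULL probe value yields NULL rather than true/false.
       hashset_contains('{1,2}'::int4hashset, x) AS strict_result,
       -- The caller decides what a NULL probe means, here "not a member".
       COALESCE(hashset_contains('{1,2}'::int4hashset, x), false) AS null_as_absent
FROM input;
```

The set itself stays NULL-free, and the query states explicitly how an unknown
value should be treated, instead of that decision being baked into the type.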

Does anyone else have a strong, realistic example where including NULLs in the
set would simplify the SQL query?

/Joel
