Re: Do we want a hashset type?

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Joel Jacobson <joel(at)compiler(dot)org>, jian he <jian(dot)universality(at)gmail(dot)com>
Cc: Tom Dunstan <pgsql(at)tomd(dot)cc>, Andrew Dunstan <andrew(at)dunslane(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Do we want a hashset type?
Date: 2023-06-20 12:10:17
Message-ID: 62242c6d-2f24-ea83-67e5-78a615cace93@enterprisedb.com
Lists: pgsql-hackers

On 6/20/23 12:59, Joel Jacobson wrote:
> On Mon, Jun 19, 2023, at 02:00, jian he wrote:
>> select hashset_contains('{1,2}'::int4hashset,NULL::int);
>> should return null?
>
> I agree, it should.
>
> I've now changed all functions except int4hashset() (the init function)
> and the aggregate functions to be STRICT.

I don't think this is correct / consistent with what we do elsewhere.
IMHO it's perfectly fine to have a hashset containing a NULL value,
and its presence should then affect the results of membership checks.

Consider these IN / ANY queries:

test=# select 4 in (1,2,3);
?column?
----------
f
(1 row)

test=# select 4 = ANY(ARRAY[1,2,3]);
?column?
----------
f
(1 row)

Now add a NULL:

test=# select 4 in (1,2,3,null);
?column?
----------

(1 row)

test=# select 4 = ANY(ARRAY[1,2,3,NULL]);
?column?
----------

(1 row)

I don't see why a (hash)set should behave any differently. It's true
that array containment (@>) doesn't behave like this:

test=# select array[1,2,3,4,NULL] @> ARRAY[5];
?column?
----------
f
(1 row)

but I'd say that's more an anomaly than something we should replicate.

This is also what the SQL standard does for multisets - there's an
SQL:20nn draft at http://www.wiscorp.com/SQLStandards.html, and the
<member predicate> section (p. 475) explains how this should work with NULL.

So if we see a set as a special case of multiset (with no duplicates),
then we have to handle NULLs this way too. It'd be weird to have this
behavior inconsistent.
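
To make the intended semantics concrete, here is a rough sketch in Python
of the three-valued membership check that IN / ANY perform in the queries
above. It is only a model, not the extension's code: the function name
sql_contains is hypothetical, and None stands in for SQL NULL (unknown).

```python
from typing import Iterable, Optional


def sql_contains(elements: Iterable[Optional[int]],
                 value: Optional[int]) -> Optional[bool]:
    """Model `value = ANY(elements)` under SQL three-valued logic.

    Returns True, False, or None (None plays the role of SQL NULL).
    """
    saw_unknown = False
    for elem in elements:
        if elem is None or value is None:
            # Comparing with NULL yields unknown; remember it and keep
            # looking, because a later definite match still wins.
            saw_unknown = True
        elif elem == value:
            return True
    # No definite match: the answer is unknown if any comparison was
    # unknown, otherwise it is a definite false.
    return None if saw_unknown else False


print(sql_contains([1, 2, 3], 4))        # False
print(sql_contains([1, 2, 3, None], 4))  # None
print(sql_contains([1, 2, 3, None], 1))  # True
```

With a NULL among the elements, a failed lookup can no longer be a
definite "false", which is exactly the difference between the two pairs
of queries above.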

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
