From: | Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> |
---|---|
To: | Joel Jacobson <joel(at)compiler(dot)org> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Isaac Morland <isaac(dot)morland(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com> |
Subject: | Re: [PATCH] Support empty ranges with bounds information |
Date: | 2021-03-02 20:40:17 |
Message-ID: | 1C23E6FC-9EDD-43DC-8C00-A753236F5F5D@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> On Mar 2, 2021, at 12:04 PM, Joel Jacobson <joel(at)compiler(dot)org> wrote:
>
> On Tue, Mar 2, 2021, at 20:57, Mark Dilger wrote:
>> I didn't phrase that clearly enough. I'm thinking about whether you include the bounds information in the hash function. The current implementation of hash_range(PG_FUNCTION_ARGS) is going to hash the lower and upper bounds, since you didn't change it to do otherwise, so "equal" values won't always hash the same. I haven't tested this out, but it seems you could get a different set of rows depending on whether the planner selects a hash join.
>
> I think this issue is solved by the empty-ranges-with-bounds-information-v2.patch I just sent,
> since with it, there are no semantic changes at all, lower() and upper() works like before.
There are semantic differences, because hash_range() doesn't call lower() and upper(), it uses RANGE_HAS_LBOUND and RANGE_HAS_UBOUND, the behavior of which you have changed. I created a regression test and expected results and checked after applying your patch, and your patch breaks the hash function behavior. Notice that before your patch, all three ranges hashed to the same value, but not so after:
@@ -1,18 +1,18 @@
select hash_range('[a,a)'::textrange);
hash_range
------------
- 484847245
+ -590102690
(1 row)
select hash_range('[b,b)'::textrange);
hash_range
------------
- 484847245
+ 281562732
(1 row)
select hash_range('[c,c)'::textrange);
- hash_range
-------------
- 484847245
+ hash_range
+-------------
+ -1887445565
(1 row)
You might change how hash_range() works to get all "equal" values to hash the same value, but that just gets back to the problem that non-equal things appear to be equal. I guess that's your two-warty-feet preference, but not everyone is going to be in agreement on that.
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Joel Jacobson | 2021-03-02 20:51:29 | Re: [PATCH] Support empty ranges with bounds information |
Previous Message | Robert Haas | 2021-03-02 20:39:29 | Re: new heapcheck contrib module |