Re: hashing bpchar for nondeterministic collations is broken

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Re: hashing bpchar for nondeterministic collations is broken
Date: 2022-12-02 20:05:27
Message-ID: 7692740d4736e79032a5dac689cf2e304c03fa78.camel@j-davis.com
Lists: pgsql-bugs

On Thu, 2022-12-01 at 12:11 -0800, Jeff Davis wrote:
> I don't see any major consequences

Yikes, it looks like this is a problem for BPCHAR (without typmod
specified):

  create table p_bpchar(t bpchar collate ctest_nondet, i int)
    partition by hash(t);
  create table p0_bpchar partition of p_bpchar
    for values with (modulus 4, remainder 0);
  create table p1_bpchar partition of p_bpchar
    for values with (modulus 4, remainder 1);
  create table p2_bpchar partition of p_bpchar
    for values with (modulus 4, remainder 2);
  create table p3_bpchar partition of p_bpchar
    for values with (modulus 4, remainder 3);

  insert into p_bpchar values
    ('a', 0),
    ('a ', 1),
    ('a  ', 2),
    ('a   ', 3),
    ('a    ', 4),
    ('a     ', 5),
    ('a      ', 6),
    ('a       ', 7);

select count(*) from p0_bpchar; -- 2
select count(*) from p1_bpchar; -- 2
select count(*) from p2_bpchar; -- 3
select count(*) from p3_bpchar; -- 1
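
To see the spread row by row rather than as per-partition counts, a query
along the following lines may help. It is only a sketch against the tables
above: tableoid is the system column naming the partition that holds each
row, octet_length() reports the stored length including trailing spaces,
and hashbpchar() is the built-in (unseeded) hash function for the type.
Partition routing uses the seeded variant, so the hash values shown will
not map directly onto the remainders; the point is only whether rows that
differ in trailing spaces alone get different hashes.

```sql
-- One line per row: which partition holds it, how many bytes are stored
-- (trailing spaces included), and the unseeded bpchar hash of the value.
select tableoid::regclass as part,
       i,
       octet_length(t) as len,
       hashbpchar(t)   as h
from p_bpchar
order by i;
```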

It seems like CHAR is not a problem, even though BPCHAR is documented
as an alias, because the planner treats BPCHAR->CHAR as a length
coercion, which trims trailing spaces.
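
A way to check the CHAR case side by side, as a sketch only: the table name
p_char, its partition names, and the length 10 are made up for
illustration, and the collation definition is an assumption because
ctest_nondet is not defined in this message. If the explanation above
holds, all rows would be expected to end up in a single partition.

```sql
-- Assumption: any nondeterministic ICU collation will do here; the
-- thread's actual definition of ctest_nondet is not shown in this message.
create collation if not exists ctest_nondet
  (provider = icu, locale = 'und', deterministic = false);

-- Hypothetical counterpart of p_bpchar with an explicit length.
create table p_char(t char(10) collate ctest_nondet, i int)
  partition by hash(t);
create table p0_char partition of p_char
  for values with (modulus 4, remainder 0);
create table p1_char partition of p_char
  for values with (modulus 4, remainder 1);
create table p2_char partition of p_char
  for values with (modulus 4, remainder 2);
create table p3_char partition of p_char
  for values with (modulus 4, remainder 3);

-- Same kind of input: values that differ only in trailing spaces.
insert into p_char values
  ('a', 0),
  ('a ', 1),
  ('a  ', 2),
  ('a   ', 3);

-- Rows per partition.
select tableoid::regclass as part, count(*)
from p_char
group by tableoid
order by tableoid;
```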

And we just documented BPCHAR in v16 (0937f6d172), so the problem is
about to be worse. I suppose as of v15 we could argue that BPCHAR is
just an internal detail and that people shouldn't be creating columns
of that type?

--
Jeff Davis
PostgreSQL Contributor Team - AWS
