Re: Wrong results from inner-unique joins caused by collation mismatch

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Wrong results from inner-unique joins caused by collation mismatch
Date: 2026-05-05 02:06:58
Message-ID: CAMbWs4-7zLAdSv1AfBwff9gjXpFyPqGWqfvAZioUsEjnUxEZKA@mail.gmail.com
Lists: pgsql-hackers

On Sat, Apr 25, 2026 at 6:24 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> 0001 wrapped the logic in subroutine collations_are_compatible().

I don't think that name is good. It sounds like a general claim about
the two collations, but what the subroutine actually checks is much
narrower: whether the two collations agree on what counts as equal.
It says nothing about ordering: any two deterministic collations
agree on = but can still disagree on <.

I renamed it to collations_agree_on_equality(), which seems a better
name to me. And then I committed this patch and back-patched it to
all supported branches.
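
As an illustration of the distinction above, here is a minimal toy
sketch, not the committed PostgreSQL code. The type ToyCollation and
the function toy_collations_agree_on_equality() are hypothetical names
invented for this example. It assumes that a deterministic collation
treats two strings as equal only when they are bytewise identical, so
the check looks only at equality and never at ordering.

```c
#include <stdbool.h>

/* Toy model of a collation; NOT PostgreSQL's catalog representation. */
typedef struct ToyCollation
{
    int  id;            /* hypothetical identifier */
    bool deterministic; /* assumed: equal strings must be bytewise equal */
} ToyCollation;

/*
 * Report whether two collations agree on which strings count as equal.
 * This deliberately says nothing about how they order unequal strings.
 */
bool
toy_collations_agree_on_equality(const ToyCollation *a,
                                 const ToyCollation *b)
{
    /* A collation trivially agrees with itself. */
    if (a->id == b->id)
        return true;

    /*
     * Assumption of this sketch: deterministic collations all use
     * bytewise equality, so any two of them agree on "=" even when
     * they sort differently.
     */
    return a->deterministic && b->deterministic;
}
```

In this model, two distinct deterministic collations agree, while a
deterministic and a nondeterministic one do not.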

> 0002 fixed query_is_distinct_for(), using that subroutine.

This patch changes the signature of query_is_distinct_for, which would
be an ABI break on stable branches. So in back-patches I added a
local function query_is_distinct_for_with_collations, which is a
collation-aware version of query_is_distinct_for, and retained
query_is_distinct_for as a thin wrapper that calls that new local
function.
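
The wrapper pattern described above can be sketched as follows. This
is a toy model under stated assumptions, not the back-patched code:
ToyQuery, ToyOid, and both toy_* functions are hypothetical, and the
convention that a NULL collation array means "skip the collation
check" is an assumption made only for this example. The point is that
the exported function keeps its old signature, so extensions compiled
against it keep working, while the new logic lives in a file-local
function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int ToyOid;    /* hypothetical stand-in for Oid */

/* Toy query: each output column is unique under one collation. */
typedef struct ToyQuery
{
    int           ncols;
    const ToyOid *distinct_collations;
} ToyQuery;

/*
 * New collation-aware function.  It is static, so it adds no exported
 * symbol on a stable branch.  A NULL "collations" array skips the
 * collation check (an assumption of this sketch).
 */
static bool
toy_is_distinct_for_with_collations(const ToyQuery *query,
                                    const int *colnos,
                                    const ToyOid *collations,
                                    int n)
{
    assert(query != NULL && colnos != NULL);

    for (int i = 0; i < n; i++)
    {
        int col = colnos[i];

        if (col < 0 || col >= query->ncols)
            return false;
        if (collations != NULL &&
            collations[i] != query->distinct_collations[col])
            return false;
    }
    return n > 0;
}

/* Old exported signature, unchanged: a thin wrapper. */
bool
toy_is_distinct_for(const ToyQuery *query, const int *colnos, int n)
{
    return toy_is_distinct_for_with_collations(query, colnos, NULL, n);
}
```

Existing callers continue to call toy_is_distinct_for() unchanged;
only code inside the same file can pass collations to the new
function.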

I also committed and back-patched this patch.

- Richard
