Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Neil Chen <carpenter(dot)nail(dot)cz(at)gmail(dot)com>
Subject: Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Date: 2021-09-25 13:55:29
Message-ID: CAPmGK16g4qDfCqL7hAAdBpkCjz62gA1FSBpKtH=Bfnm6L=qkfA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Sat, Sep 25, 2021 at 4:11 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> writes:
> > One thing I noticed is that collatable operators/functions sent to the
> > remote might also cause an unexpected result when the default
> > collations are not compatible. Consider this example (even with your
> > patch):
> > ...
> > where ft1 is a foreign table with an integer column c1. As shown
> > above, the sort using the collatable function chr() is performed
> > remotely, so the select query might produce the result in an
> > unexpected sort order when the default collations are not compatible.
>
> I don't think there's anything really new there --- it's still assuming
> that COLLATE "default" means the same locally and remotely.

I thought that the example showed that we would need to specify a
collation per-operation, not only per-foreign-table-column, like
“ORDER BY chr(c1) COLLATE “foo”” where “foo” is the actual name of a
local collation matching the local server’s default collation, when
the default collation doesn’t match the remote server’s default
collation, to avoid pushing down operations incorrectly as in the
example.

> As a short-term answer, I propose that we apply (and back-patch) the
> attached documentation changes.

The attached patch looks good to me.

> Longer-term, it seems like we really have to be able to represent
> the notion of a remote column that has an "unknown" collation (that
> is, one that doesn't match any local collation, or at least is not
> known to do so).

+1

> A rough sketch for making this happen is:
>
> 1. Create a built-in "unknown" entry in pg_collation. Insert some
> hack or other to prevent this from being applied to any real, local
> column; but allow foreign-table columns to have it.
>
> 2. Apply mods, probably fairly similar to my patch, that prevent
> postgres_fdw from believing that "unknown" matches any local
> collation. (Hm, actually maybe no special code change will be
> needed here, once "unknown" has its own OID?)
>
> 3. Change postgresImportForeignSchema so that it can substitute
> the "unknown" collation at need. The exact rules for this could
> be debated depending on whether you'd rather prioritize safety or
> ease-of-use, but I think at least we should use "unknown" whenever
> import_collate is turned off. Perhaps there should be an option
> to substitute it for remote "default" as well. (Further down the
> road, perhaps that could be generalized to allow a user-controlled
> mapping from remote to local collations.)

In addition, a) we should detect whether local “default” matches
remote “default”, and b) if not, we should prevent pushing down
sort/comparison operations using collatable functions/operators like
“ORDER BY chr(c1)” in the example (and pushing down those operations
on foreign-table columns labeled with “COLLATE default” if such
labeling is allowed)?

Best regards,
Etsuro Fujita

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2021-09-25 13:59:33 Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Previous Message Tom Lane 2021-09-24 19:11:33 Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-09-25 13:59:33 Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Previous Message Tom Lane 2021-09-25 13:45:29 Re: OpenSSL 3.0.0 compatibility