Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: jurafejfar(at)gmail(dot)com, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Subject: Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Date: 2020-08-18 20:09:44
Message-ID: 1311836.1597781384@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

I wrote:
>> So I think what is happening here is that postgres_fdw's version of
>> IMPORT FOREIGN SCHEMA translates "COLLATE default" on the remote
>> server to "COLLATE default" on the local one, which of course is
>> a big fail if the defaults don't match. That allows the local
>> planner to believe that remote ORDER BYs on the two foreign tables
>> will give compatible results, causing the merge join to not work
>> very well at all.

Here's a full patch addressing this issue. I decided that the best
way to address the test-instability problem is to explicitly give
collations to all the foreign-table columns for which it matters
in the postgres_fdw test. (For portability's sake, that has to be
"C" or "POSIX"; I mostly used "C".) Aside from ensuring that the
test still passes with some other prevailing locale, this seems like
a good idea since we'll then be testing the case we are encouraging
users to use.

And indeed, it immediately turned up a new problem: if we explicitly
assign a collation to a foreign-table column c, the system won't
ship WHERE clauses as simple as "c = 'foo'" to the remote. This
surprised me, but the reason turned out to be that what postgres_fdw
is actually seeing is something like

{OPEXPR
:opno 98
:opfuncid 67
:opresulttype 16
:opretset false
:opcollid 0
:inputcollid 950
:args (
{VAR
:varno 6
:varattno 4
:vartype 25
:vartypmod -1
:varcollid 950
:varlevelsup 0
:varnosyn 6
:varattnosyn 4
:location 171
}
{RELABELTYPE
:arg
{CONST
:consttype 25
:consttypmod -1
:constcollid 100
:constlen -1
:constbyval false
:constisnull false
:location 341
:constvalue 9 [ 36 0 0 0 48 48 48 48 49 ]
}
:resulttype 25
:resulttypmod -1
:resultcollid 950
:relabelformat 2
:location -1
}
)
:location -1
}

that is, the constant is being explicitly relabeled with the correct
collation, and thus is_foreign_expr() thinks the collation shown by
the RelabelType node is an unsafely-introduced collation.

What I did about this was to change the recursion rule in
foreign_expr_walker() so that merging a safely-derived collation with
the same collation unsafely derived is considered safe. I think this
is all right, and it allows us to accept some cases that previously
were rejected as unsafe. But I might be missing something.

(BTW, there's an independent bug here, which is that we're getting
a tree of the above shape rather than a simple Const with the
appropriate collation; that is, this tree isn't fully const-folded.
This is a bug in canonicalize_ec_expression, which I'll go fix
separately. But it won't affect the problem at hand.)

This seems like a sufficiently large change in postgres_fdw's
behavior to require review, so I'll go add this to the next CF.

regards, tom lane

Attachment Content-Type Size
fix-postgres-fdw-collation-handling-1.patch text/x-diff 37.6 KB

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Jiří Fejfar 2020-08-19 05:39:36 Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Previous Message Tom Lane 2020-08-18 16:20:04 Re: BUG #16461: Segfault in autovacuum process

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2020-08-18 20:28:05 Re: pgsql: snapshot scalability: cache snapshots using a xact completion co
Previous Message Pavel Stehule 2020-08-18 18:04:24 Re: proposal: enhancing plpgsql debug API - returns text value of variable content