Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Neil Chen <carpenter(dot)nail(dot)cz(at)gmail(dot)com>
Subject: Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Date: 2021-09-09 15:45:32
Message-ID: CAPmGK15ohpwdAPYA=Fw+C+DTdoPMWrpr6_q_XR6u3eJH7MSkVg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs pgsql-hackers

On Thu, Sep 2, 2021 at 5:42 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> The real reason that this hasn't gotten committed is that I remain
> pretty uncomfortable about whether it's an acceptable solution to
> the problem. Suddenly asking people to plaster COLLATE clauses
> on all their textual remote columns seems like a big compatibility
> gotcha.

I think so too. I reviewed the patch:

/*
* If the Var is from the foreign table, we consider its
- * collation (if any) safe to use. If it is from another
+ * collation (if any) safe to use, *unless* it's
+ * DEFAULT_COLLATION_OID. We treat that as meaning "we don't
+ * know which collation this is". If it is from another
* table, we treat its collation the same way as we would a
* Param's collation, ie it's not safe for it to have a
* non-default collation.
@@ -350,7 +352,12 @@ foreign_expr_walker(Node *node,

/* Else check the collation */
collation = var->varcollid;
- state = OidIsValid(collation) ? FDW_COLLATE_SAFE :
FDW_COLLATE_NONE;
+ if (collation == InvalidOid)
+ state = FDW_COLLATE_NONE;
+ else if (collation == DEFAULT_COLLATION_OID)
+ state = FDW_COLLATE_UNSAFE;
+ else
+ state = FDW_COLLATE_SAFE;

One thing I noticed about this change is:

explain (verbose, costs off) select * from ft3 order by f2;
QUERY PLAN
---------------------------------------------------------
Sort
Output: f1, f2, f3
Sort Key: ft3.f2
-> Foreign Scan on public.ft3
Output: f1, f2, f3
Remote SQL: SELECT f1, f2, f3 FROM public.loct3
(6 rows)

where ft3 is defined as in the postgres_fdw regression test (see the
section “test handling of collations”). For this query, the sort is
done locally, but I think it should be done remotely, or an error
should be raised, as we don’t know the collation assigned to the
column “f2”. So I think we need to do something about this.

Having said that, I think another option for this would be to left the
code as-is; assume that 1) the foreign var has "COLLATE default”, not
an unknown collation, when labeled with "COLLATE default”, and 2)
"COLLATE default” on the local database matches "COLLATE default” on
the remote database. This would be the same as before, so we could
avoid the concern mentioned above. I agree with the
postgresImportForeignSchema() change, except creating a local column
with "COLLATE default" silently if that function can’t find a remote
collation matching the database's datcollate/datctype when seeing
"COLLATE default”, in which case I think an error should be raised to
prompt the user to check the settings for the remote server and/or
define foreign tables manually with collations that match the remote
side. Maybe I’m missing something, though.

Anyway, here is a patch created on top of your patch to modify
async-related test cases to work as intended. I’m also attaching your
patch to make the cfbot quiet.

Sorry for the delay.

Best regards,
Etsuro Fujita

Attachment Content-Type Size
0001-fix-postgres-fdw-collation-handling-4.patch application/octet-stream 41.5 KB
0002-modify-postgres-fdw-async-test-cases.patch application/octet-stream 7.4 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2021-09-09 16:00:21 Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails
Previous Message Tomas Barton 2021-09-09 12:50:00 Segmentation fault on postgresql 13.4, 12.8 with function call in a cursor

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2021-09-09 15:52:37 Re: Why does bootstrap and later initdb stages happen via client?
Previous Message Jaime Casanova 2021-09-09 15:38:33 Re: Asymmetric partition-wise JOIN