Re: Bad Data back Door

From: "Albe Laurenz" <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "Tom Lane *EXTERN*" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "David E(dot) Wheeler" <david(at)justatheory(dot)com>
Cc: "Pg Hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Bad Data back Door
Date: 2012-10-08 07:25:52
Message-ID: D960CB61B694CF459DCFB4B0128514C20886A6F4@exadv11.host.magwien.gv.at
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
> "David E. Wheeler" <david(at)justatheory(dot)com> writes:
>> On Oct 5, 2012, at 6:12 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Now, having said that, I think it has to be the reponsibility of the
FDW
>>> to apply any required check ... which makes this a bug report
against
>>> oracle_fdw, not the core system. (FWIW, contrib/file_fdw depends on
the
>>> COPY code, which will check encoding.)

>> I agree that this is a bug in oracle_fdw (well, potentially;
ultimately,
>> it's Oracle that's lying about the encoding of those text values).
>> But I think that it would be much more useful overall -- not
>> to mention more database-like -- for PostgreSQL to provide a way to
>> enforce it. That is, to consider foreign tables to be an input like
>> COPY or SQL, and to validate values before displaying them.

> It is the FDW's responsibility to deal with this. We expect it to
hand
> back valid tuples; it is not reasonable to disassemble them looking
for
> mistakes (and we couldn't catch most mistakes, anyway). If the
> interface were defined in terms of text, we could do checking above
the
> FDW level ... but it isn't.

As the author I agree that this is a bug in oracle_fdw.

This was caused by ignorance on my part: I had assumed that the
type input functions would perform the necessary checks, but it
seems like that is not the case. I'll look into it.

Oracle does not care much about correct encoding.
If client character set and database character set are the same,
Oracle does not bother to check the data. This is probably how
WINDOWS-1252 characters slipped into the UTF-8 database in question.
I consider this a bug in Oracle, but never reported it, because
I don't have much hope that Oracle would see it as a problem
given their habitually sloppy handling of encoding issues.

Yours,
Laurenz Albe

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David E. Wheeler 2012-10-08 07:30:49 Re: Bad Data back Door
Previous Message Simon Riggs 2012-10-08 07:12:12 Re: Rethinking placement of latch self-pipe initialization