MS ASCII characters in text field

From: "Kevin McCarthy" <kemccarthy1(at)gmail(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: MS ASCII characters in text field
Date: 2007-03-26 15:13:46
Message-ID: 4178da10703260813u3ba476a2x59e296aad27d7d7c@mail.gmail.com
Lists: pgsql-novice

I'm running into a problem that, from my investigative work online, seems to
be more common than I'd suspected. We are hosting a site using Apache and
PHP5 that lets users upload textual content via HTML forms. The code that
processes the forms inserts the text into text fields in various tables.

Often users will copy and paste text directly from MS Word docs into the
forms, and that text invariably contains Microsoft's proprietary formatting of
characters such as 'smart' quotes, trademark, copyright symbols, accent
grave, etc. We've set the HTML pages as UTF-8 and the database connection to
UTF-8. However, when we try to import data that includes any of these
characters into the database, the queries fail, complaining that e.g.
"[nativecode=ERROR: character 0xe28093 of encoding "UTF8" has no equivalent
in "LATIN9"]"
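
For reference, the byte sequence in that error can be decoded to see which character is being rejected. The sketch below uses Python rather than the site's PHP, purely for illustration. The helper name `fits_latin9` is hypothetical, and the sketch assumes Python's `iso8859_15` codec covers the same repertoire as the encoding PostgreSQL calls LATIN9 (ISO 8859-15).

```python
import unicodedata

# The three bytes quoted in the error message: 0xe28093.
raw = bytes.fromhex("e28093")

# Interpreted as UTF-8, they form a single code point.
ch = raw.decode("utf-8")
name = unicodedata.name(ch)
print(f"U+{ord(ch):04X} {name}")


def fits_latin9(text: str) -> bool:
    """Return True if every character in text exists in ISO 8859-15.

    Hypothetical helper; assumes Python's iso8859_15 codec matches LATIN9.
    """
    try:
        text.encode("iso8859_15")
        return True
    except UnicodeEncodeError:
        return False


print(fits_latin9(ch))        # the character from the error message
print(fits_latin9("\u00a9"))  # copyright sign, for comparison
```

The character turns out to be an en dash (U+2013), which Word commonly substitutes for a typed hyphen. It is valid UTF-8 but has no slot in ISO 8859-15, which suggests that the conversion target, presumably the database's own encoding, is LATIN9 even though the pages and the connection are UTF-8.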

We've tried on the PHP end to translate various ASCII characters from
literal values to specified replacements but have not been able to catch
these anomalies. Any suggestions, recommendations, experiences to relate?
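
To make the kind of translation we have been attempting concrete, here is a minimal sketch of the idea, again in Python for illustration rather than our actual PHP code. The table `WORD_PUNCTUATION` and the functions `to_latin9_safe` and `unmapped` are hypothetical names, the replacement strings are only one possible choice, and the sketch assumes the input has already been decoded from UTF-8 into a Unicode string.

```python
# Hypothetical mapping from Word-style punctuation (by code point) to plain
# replacements that exist in ISO 8859-15. The replacements are assumptions.
WORD_PUNCTUATION = {
    0x2018: "'",     # left single quotation mark
    0x2019: "'",     # right single quotation mark
    0x201C: '"',     # left double quotation mark
    0x201D: '"',     # right double quotation mark
    0x2013: "-",     # en dash
    0x2014: "-",     # em dash
    0x2026: "...",   # horizontal ellipsis
    0x2122: "(TM)",  # trade mark sign
}


def to_latin9_safe(text: str) -> str:
    """Replace the mapped punctuation; leave every other character alone."""
    return text.translate(WORD_PUNCTUATION)


def unmapped(text: str) -> list[str]:
    """List characters that still cannot be encoded after translation."""
    leftovers = set()
    for ch in to_latin9_safe(text):
        try:
            ch.encode("iso8859_15")
        except UnicodeEncodeError:
            leftovers.add(ch)
    return sorted(leftovers)


sample = "\u201cPages 3\u20135\u201d \u2122 \u00a9"
print(to_latin9_safe(sample))
print(unmapped(sample))
```

The keys are Unicode code points, not single-byte values, which may be why matching on literal byte values has not caught these characters. A table like this only covers the characters someone thought to list, so `unmapped` is there to report anything that slips through.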

TIA

--
Kevin McCarthy
kemccarthy1(at)gmail(dot)com
