Re: Regarding bytea column in Posgresql

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: John R Pierce <pierce(at)hogranch(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Regarding bytea column in Posgresql
Date: 2015-04-10 03:36:28
Message-ID: CAMsr+YHycF3KLuypOqSH4HDuJG4MaDeFhs5t-APRgr0x5tuVpQ@mail.gmail.com
Lists: pgsql-general

On 10 April 2015 at 03:27, John R Pierce <pierce(at)hogranch(dot)com> wrote:

> one possible rationale for using BYTEA is that the data could be in
> various encodings, which the application wishes to preserve, and keeps
> track of somewhere else (perhaps in a field within the XML?).

Thanks for bringing this up, as it's a good reason to use bytea for XML.

XML actually has an encoding field in the XML declaration, e.g.

<?xml version="1.0" encoding="UTF-8"?>

It is common - and of dubious correctness - for applications to store XML
in a 'text' or 'xml' field without changing the 'encoding' field in the
XML declaration to reflect the encoding at rest.
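
To make the mismatch concrete, here is a minimal Python sketch (not
anything Pg does itself) that reads the declared encoding from the raw
bytes and checks whether the bytes actually decode under it. The helper
names declared_encoding and decodes_as_declared are made up for
illustration, and the regex only handles ASCII-compatible encodings.

```python
import re

# Matches the encoding pseudo-attribute of a leading XML declaration.
# Assumption: the document is in an ASCII-compatible encoding, so the
# declaration itself can be matched as plain ASCII bytes.
_DECL = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(doc: bytes):
    """Return the encoding named in the XML declaration, or None."""
    m = _DECL.match(doc)
    return m.group(1).decode("ascii") if m else None


def decodes_as_declared(doc: bytes) -> bool:
    """True if the bytes decode cleanly under the declared encoding.

    Falls back to UTF-8 when no encoding is declared.
    """
    enc = declared_encoding(doc) or "utf-8"
    try:
        doc.decode(enc)
        return True
    except (UnicodeDecodeError, LookupError):
        return False


text = '<?xml version="1.0" encoding="UTF-8"?><name>Jos\u00e9</name>'
honest = text.encode("utf-8")        # bytes agree with the declaration
stale = text.encode("iso-8859-1")    # re-encoded, declaration untouched

print(declared_encoding(stale), decodes_as_declared(honest),
      decodes_as_declared(stale))
```

Note the check is one-sided: every byte sequence decodes under
iso-8859-1, so a document declared as iso-8859-1 but stored as utf-8
would pass it and silently yield mojibake.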

Personally I wish the 'xml' type in Pg knew how to change the encoding
declaration dynamically, but I know it's a hairy problem; e.g. if the
client_encoding is iso-8859-1, but the client then converts the XML
document to utf-8 internally, the declared encoding will be wrong if the
client doesn't change it back.
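
As a rough sketch of what "changing the declaration dynamically" would
involve, the Python below transcodes a document and rewrites the
encoding field in the same step. transcode_xml is a hypothetical helper,
not an existing Pg or library function, and it assumes the caller
already knows the true source encoding.

```python
import re

# Captures everything up to the opening quote of the encoding value,
# then the closing quote, so only the value itself is replaced.
_ENC = re.compile(r'^(<\?xml[^>]*?encoding\s*=\s*["\'])[^"\']+(["\'])')


def transcode_xml(doc: bytes, source: str, target: str) -> bytes:
    """Re-encode doc from source to target and update its declaration.

    A document with no encoding declaration is re-encoded unchanged.
    """
    text = doc.decode(source)
    text = _ENC.sub(lambda m: m.group(1) + target + m.group(2), text, count=1)
    return text.encode(target)


latin1 = '<?xml version="1.0" encoding="ISO-8859-1"?><name>Jos\u00e9</name>'
converted = transcode_xml(latin1.encode("iso-8859-1"), "iso-8859-1", "utf-8")
print(converted.decode("utf-8"))
```

The hairy part is exactly what this sketch leaves out: any later layer
that re-encodes the document without going through such a step makes
the declaration stale again.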

I've also run into XML documents that shove data in different encodings
into CDATA sections. This is wrong, of course, but apps sometimes do it
anyway.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
