Quick Links

Re: Duplicate primary key when primary key is varchar

From:	Oliver Jowett <oliver(at)opencloud(dot)com>
To:	Tính Trương Xuân <tinh(dot)truong(at)evolus(dot)vn>
Cc:	pgsql-jdbc(at)postgresql(dot)org
Subject:	Re: Duplicate primary key when primary key is varchar
Date:	2010-10-06 11:51:48
Message-ID:	4CAC62D4.2020800@opencloud.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-jdbc

Tính Trương Xuân wrote:

> In the State table, you will notice that there are two rows with the
> primary key is AL (it's capital A and capital L). But when you select:
> select * from state where code='AL'
> There is only one row returned. You may guest that there are some spaces
> in the primary key? Nope!
> You can verify by this query:
> select 'a'||code||'b' from state
> You will see that there are two rows with the data as aALb. No space in
> between!

You do have a space in one of the rows - it's a ZERO WIDTH NO-BREAK
SPACE (U+FEFF), also used as a byte order mark in UTF-16. Here's the
relevant bit courtesy of hexdump -C:

Note the "0a ef bb bf 41 4c" - that's a linefeed, followed by the UTF-8
encoding of U+FEFF, followed by "AL"

"less -U" renders it like this:

> COPY state (code, name, status) FROM stdin;
> <U+FEFF>AL^IALABAMA^IACTIVE
> AK^IALASKA^IACTIVE

I guess that your UTF-16 input starts with a BOM, and you're just
passing it through unchanged. Perhaps you are explicitly claiming to
your parser that the input is UTF-16LE, which shouldn't start with a
BOM, so your parser is interpreting it as a no-break space rather than
filtering it out?

Wikipedia says "For the IANA registered charsets UTF-16BE and UTF-16LE,
a byte order mark must not be used because the names of these character
sets already determine the byte order. If encountered, an initial U+FEFF
must be interpreted as a (deprecated) "zero width no-break space".'

Oliver

In response to

Duplicate primary key when primary key is varchar at 2010-10-06 10:16:54 from Tính Trương Xuân

Browse pgsql-jdbc by date

	From	Date	Subject
Next Message	louis	2010-10-06 11:57:02	The logic behind the prepared statement in PostgreSQL
Previous Message	louis	2010-10-06 11:41:42	The logic behind the prepared statement in PostgreSQL