| From: | "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> |
|---|---|
| To: | Erik Wienhold <ewie(at)ewie(dot)name> |
| Cc: | Hoda Salim <hoda(dot)s(dot)salim(at)gmail(dot)com>, pgsql-docs(at)lists(dot)postgresql(dot)org |
| Subject: | Re: [PATCH] docs: document N'...' national character string literal syntax |
| Date: | 2026-02-04 01:31:36 |
| Message-ID: | CAKFQuwaduQejuEz_S2RxwQ5FivU-N7q7ArnbfeXepjD7VotPVQ@mail.gmail.com |
| Lists: | pgsql-docs |
On Tue, Feb 3, 2026 at 5:24 PM Erik Wienhold <ewie(at)ewie(dot)name> wrote:
> On 2026-02-02 21:04 +0100, Hoda Salim wrote:
> > > nchar is an alias of bpchar. There's no cast to char behind the scenes
> > > since that would truncate the string:
>
I'm seeing things differently:
postgres=# select '123 '::nchar, '123 '::bpchar;
 bpchar | bpchar
--------+--------
 1      | 123
(1 row)
I'm not a huge fan of the proposed wording, partly because the rest of the
section doesn't mention data types at all, so this would move the bar
forward for one entry while leaving the others behind.
nit: I don't like saying N'...' is equivalent to a data type; something
more like N'...' produces a value of type bpchar.
Also, any reason not to just say:
"This syntax is accepted for compatibility with the SQL standard." and
move on? Repeating "uses a single character set/does not implement a
separate national character set" seems unnecessary. If there is at least
one secondary consideration for accepting this syntax we should state what
it is.
I'd copy the E'...' wording in the first paragraph:
... just before the opening single quote, e.g., N'foo'.
I'd suggest going even further in the emulation by leading with the title
of the thing being described, then the syntax. i.e., flip the ordering of
the first two sentences and rework for flow.
My thought:
For compatibility with the SQL standard, PostgreSQL accepts national
character string constants. A national character string constant is
specified by writing the letter N (upper or lower case) just before the
opening single quote, e.g., N'foo'. (When continuing a national character
string constant across lines, write N only before the first opening
quote.) PostgreSQL's implementation requires that characters comprising
the literal be encoded using the database encoding, just like all other
string constants. In fact, the concept of national character strings is
implemented purely at the SQL syntax layer (including data type names nchar
and nchar varying). Within the database, the bpchar and bpchar(n) data
types are used.
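
For illustration, the syntax described in that paragraph could be exercised as in the sketch below. The column aliases are made up for the example, and no output is shown because the statements were not run for this message; pg_typeof is used only to let the reader check the resulting type on their own server.

```sql
-- The N prefix goes directly before the opening quote; either case is accepted.
SELECT N'foo' AS upper_n, n'foo' AS lower_n;

-- Inspect the type that the national character string constant produces.
SELECT pg_typeof(N'foo');

-- Continuation across lines: N is written only before the first opening quote.
SELECT N'foo'
'bar' AS continued;
```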
A similar note would be added to Data Types. I'd add after Example 8.1:
The SQL standard defines two additional data types pertaining to national
character strings. PostgreSQL only accommodates a single, database-wide,
character set via its database encoding, and so gains no practical benefit
from these distinct data types. However, as a compatibility shim,
PostgreSQL does implement SQL syntax to accept the nchar and nchar varying
data types. These get mapped onto bpchar and bpchar(n) (and thus
character) data types respectively.
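
A reader who wants to confirm what the nchar spellings resolve to could use something like the following sketch. The table name nchar_demo is hypothetical, and the result is deliberately not shown here since it should be checked against the server version in question.

```sql
-- Hypothetical throwaway table declared with the national character type names.
CREATE TEMP TABLE nchar_demo (
    a nchar(5),
    b nchar varying(5)
);

-- Report the types the catalog actually stores for those columns.
SELECT attname, format_type(atttypid, atttypmod) AS stored_type
FROM pg_attribute
WHERE attrelid = 'nchar_demo'::regclass
  AND attnum > 0
ORDER BY attnum;
```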
David J.