[NOVICE] Extended display and extended ascii characters

From: Ennio Iannucci <Ennio-Sr(at)ei(dot)hnet>
To: pgsql-novice(at)postgresql(dot)org
Cc: pgsql-general(at)postgresql(dot)org
Subject: [NOVICE] Extended display and extended ascii characters
Date: 2003-03-20 15:39:16
Message-ID: 20030320153916.GA1022@deby.ei.hnet
Lists: pgsql-general pgsql-novice

Hi. I'm duplicating a previous message I sent to pgsql-general
which seems to have gone astray: it is nowhere to be seen (perhaps it was
rejected and no rejection notice was issued?).

To: psql-general(at)postgresql(dot)org
Subject: Expanded display (\x) seems to affect display of latin1 chars
Message-ID: <20030318010737(dot)GA6211(at)deby(dot)ei(dot)hnet>

Hi to all of you.
This is my first posting to a mailing list and I hope you will forgive
any breach of etiquette I may have committed in this message.
My name is Ennio and I am doing my best to learn how to use what appears
to be a wonderful and powerful program, in order to be able to transfer
some old dbIII/Clipper files under Linux (Debian).

As to the message itself, I prepared it a few days ago as a bug report
(before discovering the correct way to submit bugs), so I thought
it would be appropriate to send it as it was, with all the details.
In the meantime I ran further tests which show the problem
more briefly (marked as msg#2).

msg#1
-----
Severity: Minor Annoyance (but great curiosity)
Short Description: special characters (beyond ascii char(127)) seem to behave in a very strange way in a postgresql table.

THE FACTS:
=========
I created a pg table from a *.txt file coming from a *.dbf file (MS-DOS codepages 437 and/or 850) containing accented vowels and other characters beyond ascii 127.

The psql command:
mydb=> select autore, titolo, collana from bibl where autore like '%Tour%';
displays the special characters correctly, whereas:
mydb=> select * from bibl;
substitutes the hex char code for the character glyph, that is to say:
you get a <85> instead of char(133) [a with grave].
Moreover, when the expanded display is toggled on (\x), both the above commands produce the same effect, i.e. <hex char codes> instead of the proper letter glyph.

mydb=>update bibl set titolo='Rom[char(133)]' where titolo like '%Rom%';
will produce Rom[char(183)], i.e. a capital A with an accent! But when the same command is repeated, the result displays correctly, i.e. the glyph for char(133). However, a new select given after setting expanded display on (\x) will again show the accented capital A (char(183)) instead of the correct char(133).
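
For reference, a minimal sketch in Python (not part of the original tests) of how the two byte values mentioned above read under the code pages involved. It assumes the imported file holds the raw DOS bytes unchanged; the helper name describe_byte is made up for illustration.

```python
import unicodedata

def describe_byte(value, encodings=("cp437", "cp850", "latin-1")):
    """Return {encoding: Unicode name of the character that byte decodes to}."""
    result = {}
    for enc in encodings:
        ch = bytes([value]).decode(enc)
        # Control characters have no Unicode name, so fall back to a placeholder.
        result[enc] = unicodedata.name(ch, f"<control U+{ord(ch):04X}>")
    return result

# 133 and 183 are the two char() codes quoted in the report above.
for value in (133, 183):
    print(value, hex(value), describe_byte(value))
```

Under these assumptions, byte 133 (hex 85) is a with grave in both DOS code pages but an unnamed control character in Latin-1, which would be consistent with a Latin-1 tool printing <85> instead of a glyph. Byte 183 is the capital A with grave only in code page 850.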

It may be of some interest to note that:
- I can type the desired special characters using ALT+dec.code from the console in or outside psql (and they display correctly).
- less [| cat] orig_file.txt will show hex code
- vim orig_file.txt will display ~E (tilde+E) instead of char(133).

THESE ARE THE VARIOUS STEPS I TOOK:
==================================
# su - postgres

AND AS POSTGRES:
rm -r /var/lib/postgres/data
mkdir /var/lib/postgres/data
initdb -D /var/lib/postgres/data -E latin1

THE SUGGESTED COMMAND TO START POSTGRES:
/usr/lib/postgresql/bin/postmaster -D /var/lib/postgres/data
HUNG AFTER PRINTING THESE LINES:
FindExec: found "/usr/lib/postgresql/bin/postgres" using argv[0]
invoking IpcMemoryCreate(size=1949696)
FindExec: found "/usr/lib/postgresql/bin/postmaster" using argv[0]

SO I STARTED IT WITH THE COMMAND (FROM ROOT):
# /etc/init.d/postgresql start

[OK, I've since discovered that a CTRL+C would have completed the command
that seemed to have hung]

THEN, BACK AS POSTGRES:
$ psql template1
template1=> create user john with password 'john' createdb;
\q

AND, TO CHECK WHETHER OR NOT I HAD THE RIGHT CHARACTER SET:
$ /usr/lib/postgresql/bin/pg_controldata /var/lib/postgres/data
pg_control version number: 71
Catalog version number: 200201121
Database state: IN_PRODUCTION
pg_control last modified: Thu Mar 13 23:27:41 2003
Current log file id: 0
Next log file segment: 1
Latest checkpoint location: 0/2070AC
Prior checkpoint location: 0/1FA424
Latest checkpoint's REDO location: 0/2070AC
Latest checkpoint's UNDO location: 0/0
Latest checkpoint's StartUpID: 11
Latest checkpoint's NextXID: 124
Latest checkpoint's NextOID: 24748
Time of latest checkpoint: Thu Mar 13 23:27:39 2003
Database block size: 8192
Blocks per segment of large relation: 131072
LC_COLLATE: it_IT
LC_CTYPE: it_IT

I BECAME USER JOHN AND AS SUCH:
$ psql template1
template1=> create database mydb;
template1=> \c mydb
mydb=> \i bibl_import.sql # TO CREATE TABLE BIBL FROM A TXT FILE (EX *.DBF)
mydb=> \i bibl_crea.sql # TO GET RID OF EXTRA COLUMNS CONTAINING DELIMITERS
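
The import step above loads the text file exactly as it came out of the *.dbf export. One possible workaround, sketched here in Python under the assumption that the file is in code page 850 and the database expects Latin-1, is to transcode the file before importing it. The function name transcode and the default code pages are illustrative choices, not something taken from the steps above.

```python
def transcode(data: bytes, source: str = "cp850", target: str = "latin-1") -> bytes:
    """Re-encode raw bytes from the source code page to the target one.

    Characters with no equivalent in the target (for example DOS
    box-drawing characters) are replaced with '?'.
    """
    return data.decode(source).encode(target, errors="replace")

# Byte 0x85 (a with grave in CP850) becomes 0xE0 (a with grave in Latin-1).
print(transcode(b"Rom\x85"))  # prints b'Rom\xe0'
```

To apply this to a whole file, read it in binary mode, pass the bytes through transcode, and write the result to a new file that is then used for the import.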

THIS IS THE VERSION I'M USING:
$ psql -V
psql (PostgreSQL) 7.2.1 (pre-packaged, installed with 'apt-get install'
from Debian/Woody)

contains support for: readline, history, multibyte
Portions Copyright (c) 1996-2001, PostgreSQL Global Development Group
Portions Copyright (c) 1996, Regents of the University of California
Read the file COPYRIGHT or use the command \copyright to see the
usage and distribution terms.

AND THESE ARE A FEW LINES FROM THE LOG FILE AFTER STARTING POSTGRES:
/usr/lib/postgresql/bin/postmaster child[2070]: starting with (postgres -d2 -v131072 -p template1 )
/usr/lib/postgresql/bin/postmaster child[2078]: starting with (postgres -d2 -v131072 -p template1 )
/usr/lib/postgresql/bin/postmaster child[2081]: starting with (postgres -d2 -v131072 -p mydb )

ARCHITECTURE:
============
Linux version 2.2.19pre17 (herbert(at)arnor) (gcc version 2.7.2.3) #1 Tue Mar 13 22:37:59 EST 2001
Detected 167047 kHz processor.
Console: colour VGA+ 80x25
Memory: 29784k/32768k available (1744k kernel code, 412k reserved, 672k data, 156k init)
CPU: Intel Pentium MMX stepping 03
PCI: PCI BIOS revision 2.10 entry at 0xfb730
Adding Swap: 192772k swap-space (priority -1)
Linux ...... 2.2.19pre17 #1 Tue Mar 13 22:37:59 EST 2001 i586 unknown

VERSION:
=======
PostgreSQL 7.2.1 on i686-pc-linux-gnu, compiled by GCC 2.95.4
--with-template=linux --prefix=/usr/lib/postgresql --enable-unicode-conversion --with-includes=/usr/include/tcl8.3 --includedir=/usr/include/postgresql --with-python --with-openssl --with-gnu-ld --disable-rpath --enable-odbc --with-unixodbc --with-CXX --enable-recode --with-tcl --with-perl --with-pam --enable-multibyte --enable-debug --enable-syslog --enable-locale --with-tclconfig=/usr/lib/tcl8.3 --with-tkconfig=/usr/lib/tk8.3 --with-maxbackends=64 --with-pgport=5432

I hope the above information will be of some help in reproducing the reported behaviour, or in letting you discover where I went wrong.

***********************************************
msg#2
-----
What follows is a *.txt file copied from a reduced table extracted from
the bigger table I was testing with.
If I read it with less or vim the accented letters will display well; if
I hexedit the file, however, these letters will be displayed as a '.'
The corresponding pg table will display well until I turn the 'Expanded
display' on (\x): then strange characters come out.

autore | titolo | altre_notizie | collana

TOUTAIN J. | L'économie antique | [Pr. acq. FB 377, pc. FB 400] | L'Évolution de l'Humanité. Synthèse Collect. Dirigée par Henri Beer
ZWIRNER Giuseppe | Istituzioni di Matematiche per gli studenti delle facoltà di chimica, agraria, scienze naturali, economia e commercio e statistica. | Parte prima: rist. dell'ottava ediz. riveduta ed ampliata con numerosi esercizi e problemi risolti e proposti. 1973. Parte sec. 4^ ed. |
BARRATT BROWN Michael | Storia economica dell'imperialismo | Tit. or. «After Imperialism» (1970). Tr. di Mirella Miotti. Edizione italiana a cura di Alberto Martinelli. Con 33 tabelle fuori testo. | Storia e classe. 17
(3 rows)

Thank you for your attention, and please accept my apologies for the long post.
Ennio.
