Re: Creating 2D arrays for pg_copy_from, reading tab-delimted text file that contains comma and double quotes

From: rob stone <floriparob(at)gmail(dot)com>
To: s400t(at)yahoo(dot)co(dot)jp, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Creating 2D arrays for pg_copy_from, reading tab-delimted text file that contains comma and double quotes
Date: 2018-12-17 14:38:24
Message-ID: c1802beea93a82f7e1fb49c198b4d6c2863e4101.camel@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hello,

On Mon, 2018-12-17 at 21:34 +0900, s400t(at)yahoo(dot)co(dot)jp wrote:
> Hello Good People of the Forum!
>
> I am trying to insert some data into a PostgreSQL database using PHP
> and struggling to create an array so that pg_copy_from function will
> accept and process the data.
>
> I can insert data but not the way I want- my data this case contains
> comma, space, double quotes and unpaired double quote.
> I need to use Excel to create the data and save it as tab delimited
> text file, and then convert encoding to UTF-8 (from Excel's ANSI)
> before passing it to the PHP.
>
> Here is my dummy data in Excel:
> rec_no item1 item2 item3 item4 item5
> 1 Denny's orange juice 1,500 yen "Dear John" 32" TV
>
> Explanation: the first row is header.
> The second row is data for each column.
> I tried to create variation like apostrophe, space between words,
> comma, double quotes and unpaired double quote (32" TV).
>
> When I save that Excel as a tab delimited text file, I get this:
> rec_no item1 item2 item3 item4 item5
> 1 Denny's orange juice "1,500 yen" """Dear John"""
> "32"" TV"
> (As seen when I opened that file with Notepad)
>
> Because my data also contains non-ascii, I saved the file with UTF-8
> encoding using the Notepad.
>
> Then I created a two-dimensional array with PHP:
>
> 1.$file = 'test.txt'; //tab delimited file
> 2.$fileRead = fopen($file, 'r');
>
> 3.$row = 1;
> 4.$twoDarray = array();
> 5.while (($line = fgetcsv($fileRead, 0, "\t")) !== FALSE) { //0 means
> I can read row whatever its length
> 6. if($row == 1){ $row++; continue; } //skip header
> 7. $line = implode(" ",$line). "\n";
> 8. $twoDarray[] = $line;
> 9.}
> 10.fclose($fileRead);
>
> Then I passed that twoDarray to pg_copy_from.
>
> $con=pg_connect("host=$host dbname=$dbname port=5432 user=$user
> password=$password");
>
> 11.if (!$con) {
> 12. die("Couldn't open..<br>\n");
> 13.}
>
> 14.if (pg_copy_from($con, $tableName, $twoDarray) !== FALSE) {
> 15. print "OK!";
> 16.}
> 17.else{
> 18. print "Not OK.";
> 19.}
>
> When I run the program, I have this error:
> Warning: pg_copy_from(): Copy command failed:
> ERROR: value too long for type character varying(32) CONTEXT: COPY
> test_table, line 1, column rec_no: "1 Denny's orange juice 1,500 yen
> "Dear John" 32" TV" in testProgram.php line xx.
>
> My table definition is:
> CREATE TABLE test_table (
> rec_no VARCHAR(32) PRIMARY KEY NOT NULL,
> item1 VARCHAR(255),..item2 .. until item5.);
>
> Obviously, my program thinks everything in the data row is for the
> first field.
> No, no.
>
> How to make it think that
> 1 is for the 'rec_no' field,
> Denny's is for the 'item1' field,
> orange juice is for the 'item2' field,
> 1,500 yen is for the 'item3' field,
> "Dear John" is for the 'item4' field and
> 32" TV is for the 'item5' field?
>
> When I tried removing '0' from line 5, that is,
> while (($line = fgetcsv($fileRead, "\t")) !== FALSE) { //without read
> length
>
> I can see data written in the database, but with some extra double
> quotes and a missing comma!
> That is data was saved as
> Denny's, orange juice, "1 500 yen", """Dear John""", and
> "32"" TV"
> into the respective fields.
> I cannot have those extra double quotes, and I cannot have missing
> comma in my data.

fgetscsv returns an array from reading a record from a text file.
So $line = fgetcsv($fileRead, 0, "\t", '"') would use tab as the
delimiter and remove any enclosure character equal to ".

I don't use Excel. With Libreoffice you can set the delimiter to the
pipe character and tell it not to use enclosures. So I don't know how
to massage your Excel file so that 32" TV is rendered correctly.

Secondly, in PHP implode takes an array and turns it into a string with
a specified character used to delimit the values from the array.

I can only suggest that you read the PHP manual, as well as do some
searches for the use of pg_copy_from, although I doubt anything useful
will turn up.

Cheers,
Rob

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jerry Sievers 2018-12-17 15:01:10 Re: conditionally terminate psql script
Previous Message Vijaykumar Jain 2018-12-17 14:32:42 Re: [External] Re: simple query on why a merge join plan got selected