Re: Apparent Problem With NULL in Restoring pg_dump

From: Andy Colson <andy(at)squeakycode(dot)net>
To: Rich Shepard <rshepard(at)appl-ecosys(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Apparent Problem With NULL in Restoring pg_dump
Date: 2011-09-17 03:18:59
Message-ID: 4E7411A3.7050606@squeakycode.net
Lists: pgsql-general

On 09/16/2011 04:42 PM, Rich Shepard wrote:
> On Thu, 15 Sep 2011, Andy Colson wrote:
>
>> First you need to trim the \n and spaces:
>>
>> andy=# insert into junk values (E'GW-22');
>> INSERT 0 1
>> andy=# insert into junk values (E'GW-22 \n');
>> INSERT 0 1
>> andy=# insert into junk values (E'GW-22 \n');
>
> Andy,
>
> Here's what worked for me:
>
> nevada=# \i junk.sql
> CREATE TABLE
> nevada=# insert into junk select * from chemistry where site_id = (E'GW-22');
> INSERT 0 803
> nevada=# insert into junk select * from chemistry where site_id = (E'GW-22 \n');
> INSERT 0 0
> nevada=# insert into junk select * from chemistry where site_id = (E'GW-22 \n');
> INSERT 0 0
> nevada=# insert into junk select * from chemistry where site_id = (E'GW-22\n');
> INSERT 0 1409
> nevada=# select '['|| rtrim(trim(trailing E'\n' from site_id)) || ']' from junk;
>
> ?column?
> ----------
> [GW-22]
> [GW-22]
>
> and so on for 2212 rows.
>
>> Trim it up:
>>
>> andy=# select '['|| rtrim(trim(trailing E'\n' from a)) || ']' from junk;
>
>> If you have a unique index you'll wanna drop it first. Once you get that done, we can remove the dups.
>
> No index on junk; I can remove it from chemistry prior to reinserting the
> cleaned rows.
>
> Also, where can I read about the select syntax you use? I find nothing
> about it in Rick van der Lans' 4th edition, the most comprehensive language
> reference I've read.
>
> Thanks,
>
> Rich
>

The fine online manual:

http://www.postgresql.org/docs/current/interactive/index.html

Especially the string ops:

http://www.postgresql.org/docs/current/interactive/functions-string.html

>> Trim it up:
>> andy=# select '['|| rtrim(trim(trailing E'\n' from a)) || ']' from junk;
>
> Andy,
>
> Scrolling through the table with rows ordered by date and chemical I find
> no duplicates ... so far. However, what I do find is that the above did not
> work:

No, it wasn't supposed to. A select statement builds a new result set and returns it to you; it won't update a table. That select statement was meant as an example for writing an update statement.

Like:

update chemistry set site_id = rtrim(trim(trailing E'\n' from site_id));
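
A minimal sketch of that preview-then-update pattern, assuming site_id is a text or varchar column. The transaction wrapper and the where clause are my additions here, not something from earlier in the thread; they just let you check the row count and back out if it looks wrong:

```sql
-- Preview only: show the rows that would change, bracketed so that
-- trailing spaces and newlines are visible. Nothing is modified here.
select '[' || site_id || ']' as before,
       '[' || rtrim(trim(trailing E'\n' from site_id)) || ']' as after
  from chemistry
 where site_id <> rtrim(trim(trailing E'\n' from site_id));

begin;

-- Rewrite only the rows whose site_id actually differs once trimmed.
update chemistry
   set site_id = rtrim(trim(trailing E'\n' from site_id))
 where site_id <> rtrim(trim(trailing E'\n' from site_id));

-- Compare the reported UPDATE count with the preview, then either:
commit;
-- or: rollback;
```

If a newline can be followed by more spaces, rtrim(site_id, E' \n') strips any trailing mix of the two characters in one call.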

If there were a unique index on chemistry(site_id), the update above would fail with a duplicate-key error as soon as two trimmed values collided, so I was warning you to drop it first.

Once the site_id was trimmed, you could then delete the dups, with:

delete from chemistry where site_id = 'GW-22' and ctid <> (select min(ctid) from chemistry where site_id = 'GW-22');
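
If "duplicate" means rows that match on more than site_id, the same ctid trick can be applied per group instead of per site. A sketch only: sample_date and param are hypothetical column names, since the real columns of chemistry are not shown in this thread:

```sql
begin;

-- Delete every row that has an identical twin stored at a lower ctid.
-- This keeps exactly one physical row per (site_id, sample_date, param) group.
delete from chemistry a
 using chemistry b
 where a.site_id     = b.site_id
   and a.sample_date = b.sample_date   -- hypothetical column
   and a.param       = b.param         -- hypothetical column
   and a.ctid > b.ctid;

-- Check the reported DELETE count before making it permanent.
commit;
-- or: rollback;
```

Note that = never matches NULL, so rows with a NULL in any of the compared columns are left alone by this statement.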

Those 11 steps you had... I was thinking two steps. The update and the delete above.

Sorry, I should have been a little clearer, but at least you got things cleaned up. PG has a huge number of data manipulation functions. If you have to export data out of a database in order to massage it, that's a failure of the database. PG (and SQL) were meant for just this kind of job.

-Andy
