Skip site navigation (1) Skip section navigation (2)

Re: Importing normalised data by SQL script in remote DB

From: Andreas <maps(dot)on(at)gmx(dot)net>
To: Sean Davis <sdavis2(at)mail(dot)nih(dot)gov>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Importing normalised data by SQL script in remote DB
Date: 2008-04-16 03:01:42
Message-ID: 48056C16.3040205@gmx.net (view raw or flat)
Thread:
Lists: pgsql-novice
Sean Davis schrieb:
> On Tue, Apr 15, 2008 at 9:54 PM, Andreas <maps(dot)on(at)gmx(dot)net> wrote:
>   
>> Hi,
>>  I've got to import data into a remote database.
>>  I get some stuff usually as excel-files that doesn't fit the db-structure
>> in respect of normalisation so I import it into Access, brush up the data
>> and push the columns from there in the right tables via ODBC.
>>
>>  For bigger imports (~5000 lines in Excel that get spread over 4-6 tables in
>> my db) I fetch the remote DB, do the import locally and transfer a dump back
>> to the remote site when noone uses the server.
>>  This is getting unwieldy as the db grows and the connection is not really
>> that fast.
>>
>>  I can access the server by ssh so it might be way faster to run a prepared
>> SQL file that consists just of the new data against the DB on the console.
>>
>>  Could I build a SQL script that adds the new stuff and creates foreign keys
>> on the fly without the need to know the new IDs before?
>>
>>  Lets suppose I got a couple of foreign-linked tables:
>>  company (company_id  serial primary key,  name, ...)
>>  person (person_id  serial primary key, company_fk, name, ...)
>>  contact (contact_id  serial primary key, person_fk, contact_date,
>> contact_notes, ...)
>>
>>  The SQL script needed to:
>>  1)   INSERT INTO company ...
>>  2)   look up the last created company_id
>>  3)   INSERT INTO person    and use the company_id to set company_fk
>>  4)   look up the last created person_id
>>  5)   INSERT INTO contact    and use the person_id to set person_fk
>>  Probaply create some other persons with their contacts.
>>  Then another company and so on.
>>
>>  I do this right now with vba in Access but this is painfully slow via
>> InterNet as it constantly has to ask the server for the last ids and the
>> Net's latency spoils all the fun.
>>     
>
> Why not avoid Access altogether.  Load the data into postgres "as-is"
> into a set of loader tables.  Then, you can use SQL to do the data
> munging and inserts, including the foreign key relationships.  You can
> use \copy in psql to load the data, so there is no need to have access
> to the server.
>
> Sean
>
>   
Thanks for your suggestion.
As said usually I get the data as a xls with wrong column names and 
often enough wrong column types, missing infos or cluttered up in 
creative ways, too.
How could I easily import a certain tab within a xls into a temporary 
pg-table?

Then often the data is a dump of JOINs so it is not normalised.
I can't help it. I just get the stuff dumped over me as our customers 
are able or willing to provide.
Most of the time it's like this:
...
company_1, ..., person_1, ...
company_1, ..., person_2, ...
company_1, ..., person_3, ...
company_2, ..., person_4, ...
company_3, ..., person_5, ...
company_3, ..., person_6, ...
...
I've got to catch the repeating companies to avoid doubles.
Until now I try to find some columns that represent a key for the 
company part of the data.
Then sort by this key columns, walk over the table and generate just a 
company whenever a part of the key changes.

AFAIK I need to insert the person records right after I created their 
company because if I first create all companies and then the people I 
wouldn't know the company-id that belongs to a person.

I'd love to learn how to do this more cleverly w/o Access.


regards
Andreas

In response to

Responses

pgsql-novice by date

Next:From: p9e883002Date: 2008-04-16 07:46:54
Subject: Inserting a record returned from a function
Previous:From: Sean DavisDate: 2008-04-16 02:05:51
Subject: Re: Importing normalised data by SQL script in remote DB

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group