Re: need help with import

From: David Johnston <polobo(at)yahoo(dot)com>
To: Andreas <maps(dot)on(at)gmx(dot)net>
Cc: "pgsql-sql(at)postgresql(dot)org" <pgsql-sql(at)postgresql(dot)org>
Subject: Re: need help with import
Date: 2012-02-16 02:41:43
Message-ID: D4BF5726-54B8-4887-B038-F8AA3E1792FF@yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Feb 15, 2012, at 21:05, Andreas <maps(dot)on(at)gmx(dot)net> wrote:

> Am 16.02.2012 02:13, schrieb David Johnston:
>> -----Original Message-----
>> From: pgsql-sql-owner(at)postgresql(dot)org [mailto:pgsql-sql-owner(at)postgresql(dot)org] On Behalf Of Andreas
>> Sent: Wednesday, February 15, 2012 8:03 PM
>> To: pgsql-sql(at)postgresql(dot)org
>> Subject: [SQL] need help with import
>>
>> Hi
>> I get CSV files to import.
>> Th structure is like this.
>> main part, sub part
>> Could be like this
>>
>> A, a1
>> A, a2
>> A, a3
>> B, b1
>> B, b2
>>
>> The database has a table for main_part and one for sub_part.
>> The relation needs to be n:m so there is a relation table that holds ( main_id, sub_id ).
>> The 2 primary keys main_part.id and sub_part.id are both serials.
>>
>> Is there a way to do an import with SQL?
>>
>> I can read the CSV into a temporary table and I can do a INSERT INTO main_part ( ... ) SELECT DISTINCT main columns FROM import; as well as a INSERT INTO sub_part ( ... ) SELECT sub columns FROM import;
>>
>> But how would I know what main_id and sub_id to insert into the n:m relation?
>>
>> At first when I do the import the relation is actually 1:n.
>>
>> --------------------------------------------
>>
>> You will need to use the temporary table and perform multiple insert+select.
>>
>> I do not understand where you are confused. It would help to provide more meaningful sample data and/or the final result you are trying to achieve. Keep in mind any n:m setup requires three tables with the joining table usually having some descriptive meaning. Is time one of your components that you are not showing us?
>>
>>
> As you say there are 3 tables
> main_part ( id serial primary key, ... )
> sub_part ( id serial primary key, ... )
> main_to_sub ( main_id, sub_id )
>
> I would read the csv into a temporary table "import" and insert the main columns into main_part ().
> Then there are new tuples in main_part()
> 42, A
> 43, B
>
> Now I insert the sub columns into sub_part()
> I'll get e.g.
> 1000, a1
> 1001, a2
> 1002, a3
> 1003, b1
> 1004, b2
>
> To fill up main_to_sub ( ... ) I needed to know which sub_id belongs to which main_id.
> ( 42, 1000 )
> ( 42, 1001 )
> ( 42, 1002 )
> ( 43, 1003 )
> ( 43, 1004 )
>
> I could compare every main-column in "import" to every related data-column in main_part to get the newly created main_id and do the same with every sub-data-column but this seems to be a wee bit tedious.
>
> Is there a more elegant way hat I don't see, yet?
>

Ditch the whole idea of using a sequence and take your primary keys from the source data. Otherwise yes, you will need to perform the join between the import and live tables to determine the newly created identifier.

The question to answer is if you see the same values in subsequent import files do you create a new sequence value or reuse the existing value? Why?

David J.

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Raj Mathur ( राज =?utf-8?b?IOCkruCkvuCkpeClgeCksA==?=) 2012-02-16 02:55:29 Re: need help with import
Previous Message Andreas 2012-02-16 02:05:52 Re: need help with import