Time consuming process ...3 million records please help

From: Srinivas Iyyer <srini_iyyer_bio(at)yahoo(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Subject: Time consuming process ...3 million records please help
Date: 2006-03-28 00:59:50
Message-ID: 20060328005950.9895.qmail@web38102.mail.mud.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Dear group,
I have two tables and a temp table where I uploaded
data using \copy. This temp table has "3,348,107"
lines of data.

I wrote a plpgsql function to read each record in
temp_table, take the firt col. data (temp_refseq_id)
and get corresponding 'seq_id' from table B and insert
into table A.

I started this process 8 hrs back. It has been running
for last 8 hrs and yet it is not finished. the reason
i did this temp table thing was to speedup process by
writing a server side function. I still did not get
to win over the time issue here. If this is a
continuous problem I will have to search for another
db system since my files from now on are huge and has
over mil records. I am betting a lot of time in this
case.

Could any one help writing a faster function.

thanks
looking forward to hear from people.

Temp_table:

temp_refseq_id | temp_imageid | temp_genbankacc
----------------+----------------+-----------------
NM_003604 | IMAGE:1099538 | AA594716
NM_003604 | IMAGE:853047 | AA668250
NM_001008860 | IMAGE:3640970 | BC011775
NM_001008860 | IMAGE:3640970 | BE737509
NM_001008860 | IMAGE:6040319 | BU079001
NM_001008860 | IMAGE:6040319 | BU078725
NM_001008860 | IMAGE:3451448 | BC000957
NM_001008860 | IMAGE:3451448 | BE539334
NM_001008860 | IMAGE:4794135 | BG708105
NM_001008860 | IMAGE:5214087 | BI911674

Table A : (I want to upload data from temp to here)

spota_id | seq_id | spota_imageid | spota_genbacc
----------+--------+---------------+--------
23 | 54525 | IMAGE:1099538 | AA594716

Table B : This table is seqdump table where seq_id is
a FK in Table B

seq_id | seq_acc | seq_name
------------------------------
54519 | NM_152918 | EMR2
54520 | NM_001008860| CGGBP1
54521 | NM_020040 | TUBB4Q
54522 | NM_017525 | CDC42BPG
54523 | NM_032158 | WBSCR20C
54524 | NM_004047 | ATP6V0B
54525 | NM_003604 | PLCB3

Function:

CREATE FUNCTION tab_update() RETURNS integer AS '
DECLARE
referrer_keys RECORD;
BEGIN
FOR referrer_keys IN SELECT * from
temp_spotanno LOOP
INSERT INTO spotanno(seq_id,
spotanno_imageid,spotanno_genbankacc) values((SELECT
seq_id from seqdump where seq_acc =
referrer_keys.temp_refseq_id),referrer_keys.temp_imageid,referrer_keys.temp_genbankacc);
END LOOP;
return 0;

END;
' LANGUAGE plpgsql;

Thanks
Sri

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Sean Davis 2006-03-28 01:35:34 Re: Time consuming process ...3 million records please help
Previous Message johnf 2006-03-27 19:28:52 Does a connection support multiple transactions.