Re: Please comment on pgsql speed at handling 550,000 records

From: Srinivas Iyyer <srini_iyyer_bio(at)yahoo(dot)com>
To: Michael Swierczek <mike(dot)swierczek(at)gmail(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Please comment on pgsql speed at handling 550,000 records
Date: 2006-02-07 18:33:28
Message-ID: 20060207183328.45953.qmail@web34507.mail.mud.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Hi Mike,
I am using a Dell Precission 670,
RAM: 2 GB
Processor: two Intel® XeonTM 3.00GHz, 2MB L2 Cache
OS : Red Hat® Enterprise Linux WS v4,
Chipset : .Intel E7525 chipset
Disks : 1.5TB SATA disk space (3x500); If I am not
wrong they are 7200RPM disks

thanks
Srini

--- Michael Swierczek <mike(dot)swierczek(at)gmail(dot)com>
wrote:

> Dear Srinivas,
> It may be helpful to know what operating
> system you are using, and
> also the amount of system RAM, the processor type,
> and the hard drive sizes
> and speeds.
>
> -Mike
>
> On 2/7/06, Srinivas Iyyer
> <srini_iyyer_bio(at)yahoo(dot)com> wrote:
> >
> >
> > dear group,
> > I am running a file with 550,000 lines. This data
> is
> > not so complex. The scheme is
> > Table A
> >
> > seq_id | seq_name | seq_identifier
> > 1 | ABC | NM_0000023
> > ....
> >
> > (23k lines)
> >
> >
> > Table B:
> >
> > gos_id | go_id | go_cat | go_name
> > 1 GO:00003 | P | death
> > ......
> > (7k)
> >
> >
> > Now table C
> >
> > seq_id | gos_id
> > 1 1
> > 1 | 2
> > 1 | 200
> >
> > ...
> > (550K lines)
> >
> >
> >
> >
> > Table Temp_C
> > seq_name | go_id
> > ABC | GO:9993934
> > ABC | GO:3489343
> > ....
> > (550,500 lines)
> >
> >
> > question:
> >
> > TAble A and B populated easily without question.
> >
> > Table C, is completely a relationship table.
> > I wrote a python script that writes all 500K lines
> > with select statements inside.
> >
> > insert into tablec (seq_id,gos_id) values (
> > (select seq_id from table a where seq_name
> ='xxxx'),
> > (select gos_id from table b where go_id =
> 'xxxx'));
> >
> >
> > Such 500K + insert statements took a long time and
> I
> > had to abort it.
> >
> > 3.5 hrs of time, 300K lines got inserted
> >
> >
> > I wrote a plpgsql function:
> >
> > create function gosup() returns integer AS '
> > DECLARE
> > referrer_key RECORD;
> > BEIGN
> > FOR referrer_key IN SELECT * from TEMP_C LOOP
> > INSERT INTO tableC(seq_id,gos_id) values(
> > (select seq_id from table a where seq_name
> > =referrer_key.seq_name),
> > (select gos_id from table b where go_id =
> > referrer_key.go_id))
> > END LOOP
> > return 0;
> > END;
> > ' LANGUAGE plpgsql;
> >
> > so this function is written to make the inserting
> > faster.
> >
> > I used \copy command to create temp_c.
> >
> >
> > The ironical point is, even now it is taking over
> an
> > hour (still running as of now).
> >
> > So experts,
> >
> > do you really thing inserting 550K lines is really
> a
> > long time consuming step no matter what method you
> > choose (such as regular insert statements or
> writing a
> > procedure to automate this step on server side).
> >
> > Could any one throw some light on DB performance
> (no
> > matter wheter it is postgres, oracle, SyBase or
> > MySQL).
> >
> > In specific what is the state of efficiency in
> > postgres.
> >
> > Thank you.
> >
> > __________________________________________________
> > Do You Yahoo!?
> > Tired of spam? Yahoo! Mail has the best spam
> protection around
> > http://mail.yahoo.com
> >
> > ---------------------------(end of
> broadcast)---------------------------
> > TIP 1: if posting/reading through Usenet, please
> send an appropriate
> > subscribe-nomail command to
> majordomo(at)postgresql(dot)org so that your
> > message can get through to the mailing list
> cleanly
> >
>

__________________________________________________
Do You Yahoo!?
Tired of spam? Yahoo! Mail has the best spam protection around
http://mail.yahoo.com

In response to

Browse pgsql-novice by date

  From Date Subject
Next Message Srinivas Iyyer 2006-02-07 18:36:04 Re: Please comment on pgsql speed at handling 550,000 records
Previous Message Tom Lane 2006-02-07 18:33:17 Re: Please comment on pgsql speed at handling 550,000 records