Skip site navigation (1) Skip section navigation (2)

Re: Post-mature optimization...

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: sfpug(at)postgresql(dot)org
Cc: Eric Walstad <eric(at)ericwalstad(dot)com>
Subject: Re: Post-mature optimization...
Date: 2007-01-18 02:53:56
Message-ID: 200701171853.56304.josh@agliodbs.com (view raw or flat)
Thread:
Lists: sfpug
Eric,

> I've a table with ~2M records.  Very infrequently records are
> added/modified in this table.  Very regularly I must load data from
> another source into this table (overwriting all existing records that
> are included in the new data).
>
> Problem:
> My current approach is taking way too long; in fact, I've not been able
> to complete the import job (it's been running for days now).

Here's what I would do: 
1. import data into holding table
2. index PK on holding table
3. insert into swap table all records from old table which are not in holding 
table.
4. insert into swap table all records from holding table
5. drop old table
6. rename holding table to old table; index.

I think the approach you're taking now is both more complex and more 
labor-intensive.  Other comments:

> INSERT INTO new_data VALUES ('2', 'n');
> INSERT INTO new_data VALUES ('3', 'n');
> INSERT INTO new_data VALUES ('4', 'n');
> INSERT INTO new_data VALUES ('6', 'n');

You're not really using INSERTs, are you?

> -- Copy existing records to a temporary table, 'foo'
> CREATE TEMP TABLE foo AS SELECT * FROM existing_data;
> CREATE INDEX foo_service_account
>  ON foo (service_account);

What's "foo"?

> -- delete existing records
> DELETE FROM existing_data;

TRUNCATE existing_data;

> -- copy existing data that are not in new data
> -- ***** This is where my psql script churns for days *****
> INSERT INTO existing_data
>  SELECT * FROM foo
>  WHERE service_account NOT IN
>   (SELECT service_account FROM existing_data);

Again with the "foo".  I think that in genericizing your example you've made 
it a bit nonsensical.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

In response to

Responses

sfpug by date

Next:From: David FetterDate: 2007-01-18 04:39:24
Subject: Re: VoIP recommendations
Previous:From: Jeff FrostDate: 2007-01-18 01:33:06
Subject: Re: Post-mature optimization...

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group