Re: Post-mature optimization...

From: Eric Walstad <eric(at)ericwalstad(dot)com>
To: sfpug(at)postgresql(dot)org
Subject: Re: Post-mature optimization...
Date: 2007-01-18 06:01:56
Message-ID: 45AF0D54.7020600@ericwalstad.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: sfpug

Hey Josh,
Josh Berkus wrote:
> Here's what I would do:
> 1. import data into holding table
> 2. index PK on holding table
> 3. insert into swap table all records from old table which are not in holding
> table.
> 4. insert into swap table all records from holding table
> 5. drop old table
> 6. rename holding table to old table; index.
>
> I think the approach you're taking now is both more complex and more
> labor-intensive.
I'm assuming you mean (please confirm):
- 'holding table' is a temporary table into which new data is imported
- 'swap table' means a temporary table used as a destination for both
the new data and the old data that's not in the new data.
If so, the difference between what you describe and what I described is
your additional 'swap table' and a change in the order of the steps.
Also note that I cannot drop the old table because it also contains
records that cannot be deleted. It contains service_account data from
other organizations than that which I'm trying to import.
I assumed that leaving the WHERE clause out of my example would simplify
it while still modeling the problem accurately. Here's my INSERT
statement with the WHERE clause that filters in this organization's
records that were not imported with the new data:
INSERT INTO existing_data
SELECT * FROM foo
WHERE service_account NOT IN
(SELECT service_account FROM existing_data
WHERE ut_id = 1 and account_type = 'c');

> Other comments:
>> INSERT INTO new_data VALUES ('6', 'n');
>
> You're not really using INSERTs, are you?
No, that was just sample data to model the problem. The new data is
imported with:
COPY existing_data (<field list>) FROM stdin;
<looooong list of tab delimited records>
\.

>> -- Copy existing records to a temporary table, 'foo'
>> CREATE TEMP TABLE foo AS SELECT * FROM existing_data;
>> CREATE INDEX foo_service_account
>> ON foo (service_account);
>
> What's "foo"?
It's a temporary holding area for the existing data. I think it's what
you called 'swap table' above.

>> -- copy existing data that are not in new data
>> -- ***** This is where my psql script churns for days *****
>> INSERT INTO existing_data
>> SELECT * FROM foo
>> WHERE service_account NOT IN
>> (SELECT service_account FROM existing_data);
>
> Again with the "foo". I think that in genericizing your example you've made
> it a bit nonsensical.
I'm essentially:
1. moving (copy, delete) all the existing data from the existing_data
table (to foo/swap)
2. importing new data (to the existing_data table)
3. move back existing data (from foo/swap) not in new data (now in
existing_data table).

Thanks for your time and input, Josh. Do you have any other advice
after reading my explanations above?

Eric.

In response to

Responses

Browse sfpug by date

  From Date Subject
Next Message Quinn Weaver 2007-01-18 06:16:35 Re: VoIP recommendations
Previous Message Eric Walstad 2007-01-18 05:24:40 Re: VoIP recommendations