Efficiently move data from one table to another, with FK constraints?

From: Rob W <digital_illuminati(at)yahoo(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Efficiently move data from one table to another, with FK constraints?
Date: 2009-07-06 20:22:29
Message-ID: 648831.81143.qm@web34608.mail.mud.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


I am using COPY to bulk load large volumes (i.e. multi GB range) of data to a staging table in a PostgreSQL 8.3. For performance, the staging table has no constraints, no primary key, etc. I want to move that data into the "real" tables, but need some advice on how to do that efficiently.

Here's a simple, abbreviated example of tables and relations I'm working with (in reality there are a lot more columns and foreign keys).

/* The raw bulk-loaded data. No indexes or constraints. */
CREATE TABLE log_entry (
req_time TIMESTAMP NOT NULL,
url TEXT NOT NULL,
bytes INTEGER NOT NULL
);

/* Where the data will be moved to. Will have indexes, etc */
CREATE TABLE request (
id BIGSERIAL PRIMARY KEY,
req_time TIMESTAMP WITH TIME ZONE NOT NULL,
bytes INTEGER NOT NULL,
fk_url INTEGER REFERENCES url NOT NULL,
);

CREATE TABLE url (
id SERIAL PRIMARY KEY,
path TEXT UNIQUE NOT NULL,
);

Is there a way to move this data in bulk efficiently? Specifically I'm wondering how to handle the foreign keys? The naive approach is:

1) For each column that is a foreign key in the target table, do INSERT ... SELECT DISTINCT ... to copy all the values into the appropriate child tables.
2) For each row in log_entry, do a similar insert to insert the data with the appropriate foreign keys.
3) delete the contents of table log_entry using TRUNCATE

Obviously, this would be very slow when handling tens of millions of records. Are there faster approaches to solving this problem?

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Bruce Momjian 2009-07-06 20:26:30 Re: Windows installer for pg-8.4 confusion
Previous Message Paul Smith 2009-07-06 20:21:22 Out of memory error