Partial backup of linked tables

From: Cisko <78cisko78(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Partial backup of linked tables
Date: 2008-06-18 16:18:00
Message-ID: 29C5C371-FCE7-4271-AB9A-1C2A8CDDDFDE@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi, i'm new to this ML, i'll try to explain my issue:

I've two tables defined as is (postgresql 8.1):

CREATE TABLE table1
(
_id serial,
num1 int4 not null,
num2 int4 not null,

primary key(_id)
);

CREATE INDEX table1IDX1 ON table1(num1);

CREATE TABLE table2
(
_id serial,
_table1_id int not null,
num3 int4 not null,
num4 int4 not null,

primary key(_id),

foreign key(_table1_id) references table1(_id) on delete CASCADE
);

CREATE INDEX table2IDX1 ON table2(_table1_id);

I need to select only a subset of table1/table2 records and backup
them (to disk).

I proceed as following:

1. Create equivalent tables with _tmp name with indexes and cascade;

CREATE TABLE table1_tmp
(
_id serial,
num1 int4 not null,
num2 int4 not null,

primary key(_id)
);

CREATE INDEX table1_tmpIDX1 ON table1_tmp(num1);

CREATE TABLE table2_tmp
(
_id serial,
_table1_id int not null,
num3 int4 not null,
num4 int4 not null,

primary key(_id),

foreign key(_table1_id) references table1_tmp(_id) on delete CASCADE
);

CREATE INDEX table2_tmpIDX1 ON table2_tmp(_table1_id);

2. Select and insert into table1_tmp a subset of table1 based on a
query (num1 < 10)

INSERT INTO table1_tmp SELECT * from table1 WHERE num1 < 10;

3. Populate other tables with a foreign key;

INSERT INTO table2_tmp SELECT table2.* from table2, table1_tmp WHERE
table2._table1_id = table1_tmp._id;

4. Copy each table into a file (i don't have an 8.2, so that i can't
execute pg_dump with several -t options)

COPY table1_tmp TO "/tmp/table1_tmp.data";
COPY table2_tmp TO "/tmp/table2_tmp.data";

This is only an example, i've more complex tables, but schema is
equivalent to previous.

My question is: There'are some optimization/tips that i can do for
achieve better performance?
When i have several rows (10^6 or greater) returned by query into
table1, that starts to hogs time and CPU.

Doing an EXPLAIN, all queries on join are performed using indexes.

Thanks in advance,
Cisko

Browse pgsql-performance by date

  From Date Subject
Next Message Antonio Perez 2008-06-18 19:43:51 WAL DUDAS
Previous Message Matthew Wakeling 2008-06-18 12:38:42 Re: Tsearch2 Initial Search Speed