I have a unique requirement. I have a feed of 2.5 - 3 million rows of data
which arrives every half an hour. Each row has 2 small string values (about
50 chars each) and 10 int values. I need searchability and the ability to
run arbitrary queries on any of these values. This means I have to create
an index on every column. The feed comes in as a comma-separated text file.
Here is what I am planning to do:
1) Create a new table every time a new feed file comes in. Create the table
with indexes. Use the COPY command to dump the data into the table.
2) Rename the current table to some old table name, and rename the new table
to the current table name, so that applications can access it directly.
Note that these are read-only tables, and it is fine if step 2 takes a
small amount of time (it is not a mission-critical table, hence a small
downtime of a few seconds is fine).
My question is: what is the best way to do step (1) so that after the copy
is done, the table is fully indexed, properly balanced, and optimized for
queries? Should I create the indexes before or after the import? I need to
do this in the shortest period of time so that the data is always up to
date. Note that incremental updates are not possible, since almost every
row will be changed in the new file.
My table creation script looks like this:

create table datatablenew (
    fe varchar(40),
    va varchar(60),
    a int, b int, c int, d int,
    e int, f int, g int, h int,
    i int, j int, k int, l int,
    m int, n int, o int, p int,
    q real
);

create index fe_idx on datatablenew using hash (fe);
create index va_idx on datatablenew using hash (va);
create index a_idx on datatablenew (a);
create index q_idx on datatablenew (q);
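For the load itself, a common pattern is to COPY into the bare table first
and build the indexes afterwards, since maintaining indexes row-by-row
during a bulk load is much slower than building them once over the loaded
data. A sketch, assuming the feed file lives at /path/to/feed.csv (the
path is a placeholder):

```sql
-- create datatablenew as in the script above, but with no indexes yet

-- bulk-load the comma-separated feed (server-side path)
COPY datatablenew FROM '/path/to/feed.csv' WITH CSV;

-- build the indexes only after the data is in place
create index fe_idx on datatablenew using hash (fe);
create index va_idx on datatablenew using hash (va);
create index a_idx on datatablenew (a);
create index q_idx on datatablenew (q);

-- refresh planner statistics so queries use the new indexes well
ANALYZE datatablenew;
```

The final ANALYZE matters here: a freshly created table has no statistics,
and without them the planner may not pick the indexes at all.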