From: Andrew Smith <laconical(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: newbie table design question
Date: 2009-05-31 15:54:09
Message-ID: 300497a30905310854j5d34e9e8h3d805ffb5f160da@mail.gmail.com
Lists: pgsql-general
Hi all,
I'm a beginner when it comes to PostgreSQL, and I have a table design question
about a project I'm currently working on. I have 1500 data items that need
to be copied from an external system into my database every minute. Each
item has a timestamp, an identifier and a value. For example:
12/10/2008 05:00 ID_ABC 14
12/10/2008 05:01 ID_ABC 17
12/10/2008 05:02 ID_ABC 13
Pretty simple stuff. The current solution (implemented using SQL Server a
few years ago) looks like this (an approximation in PostgreSQL syntax):
CREATE TABLE "DataImport"
(
    "DataImportID" serial NOT NULL PRIMARY KEY,
    "Time" timestamp without time zone NOT NULL,
    "ID_ABC" integer NOT NULL,
    "ID_DEF" integer NOT NULL,
    "ID_HIJ" integer NOT NULL
    -- ... and so on, one column per data item
);
While this design results in only 1440 records being present in the table
per day (one row per minute), I don't like it. One problem is that every
time a data item is added or removed from the import process, the structure
of the table needs to be altered. I don't know what sort of overhead that
involves in the PostgreSQL world, but I'm thinking that once I define the
structure of the table, I don't want to be messing with it very often, if
at all.
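To make the maintenance burden concrete, bringing a new item into the current design would mean something like the following (the column name "ID_XYZ" is just an illustration). As I understand it, adding a NOT NULL column with a default also forces a rewrite of the whole table in PostgreSQL, which I'd rather avoid on a table that is written to every minute:

```sql
-- Hypothetical: adding a newly imported data item to the wide table.
-- The NOT NULL + DEFAULT combination rewrites every existing row.
ALTER TABLE "DataImport" ADD COLUMN "ID_XYZ" integer NOT NULL DEFAULT 0;
```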
My initial thought for the design of the new solution looks like this:
CREATE TABLE "DataImport"
(
    "DataImportID" serial NOT NULL PRIMARY KEY,
    "Time" timestamp without time zone NOT NULL,
    "Identifier" text NOT NULL,
    "Value" integer NOT NULL
);
Users will then be doing regular queries on this data (say, a few hundred
times per day), such as:
SELECT "Time", "Value" FROM "DataImport" WHERE "Identifier" = 'ID_ABC' AND
"Time" between '2008-11-07' and '2008-11-11';
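Since those queries always filter on "Identifier" plus a "Time" range, I assume a composite index matching that pattern would be the thing to have (just a sketch, the index name is my own):

```sql
-- Index to support lookups by identifier over a time range:
-- equality column first, range column second.
CREATE INDEX "DataImport_Identifier_Time_idx"
    ON "DataImport" ("Identifier", "Time");
```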
My concern is the row count: 1500 values * 1440 minutes per day = 2,160,000
records per day. Add this up over the course of a month (the length of time
I need to keep the data in this table) and I'll have more than 60 million
records being stored in there.
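One idea I had for keeping that volume manageable (I'm not sure if it's the right approach) is partitioning the table by month using inheritance, so that expiring a month of data is a cheap DROP rather than a huge DELETE. A rough sketch, with hypothetical partition names:

```sql
-- Sketch: a monthly partition via table inheritance.
-- The CHECK constraint lets constraint_exclusion skip partitions
-- whose time range cannot match the query's "Time" predicate.
CREATE TABLE "DataImport_2008_11"
(
    CHECK ("Time" >= '2008-11-01' AND "Time" < '2008-12-01')
) INHERITS ("DataImport");

-- Expiring the oldest month would then just be:
-- DROP TABLE "DataImport_2008_10";
```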
I guess my question is: is my approach reasonable? I haven't dealt with
tables of this size before (using any DBMS) - should I expect really slow
queries due to the sheer number of records present? Is there some better
way I should be structuring my imported data? All suggestions welcome.
Cheers,
Andrew
Next message: Tom Lane | 2009-05-31 17:25:25 | Re: newbie table design question
Previous message: Craig Ringer | 2009-05-31 13:01:03 | Re: Pl/java in 8.4 bet1 sources compilation failed