
Re: One table or many tables for data set

From: "Castle, Lindsay" <lindsay(dot)castle(at)eds(dot)com>
To: "'Joe Conway'" <mail(at)joeconway(dot)com>,pgsql-performance(at)postgresql(dot)org
Subject: Re: One table or many tables for data set
Date: 2003-07-23 01:25:07
Message-ID: B09017B65BC1A54BB0B76202F63DDCCA053248A4@auntm201
Lists: pgsql-performance
Apologies, let me clear this up a bit (hopefully) :-)

The data structure looks like this:

There are approx 12,000 distinct elements for a total of about 6 million
rows of data.

The scanning technology I want to use may need a different number of rows
and different columns depending on the scan formula; e.g.
	scan1 may need num1, num2 and num3 from the last 200 rows for element "x"
	scan2 may need num1 and units from the last 10 rows for element "y"
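
A scan like the first one could be written against a single shared table
roughly as below. The table and column names here are my assumption (the
thread never names them), and I am guessing at a timestamp column, since
"the last 200 rows" needs some ordering column to be well defined:

```sql
-- Hypothetical single-table layout: one row per observation of an element.
-- "recorded_at" is an assumed ordering column.
SELECT num1, num2, num3
FROM   element_data
WHERE  element = 'x'
ORDER  BY recorded_at DESC
LIMIT  200;
```

With one table per element, the same scan would drop the WHERE clause and
query the per-element table directly; the trade-off is 12,000 tables to
manage versus one indexed filter column.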

I can either do the scans and calculate what I need within SQL, or drag the
data out and process it outside of SQL. My preference is to stay inside SQL,
as I've assumed that would be faster and require less development work.

If I went with the many-tables design I would not expect to need joins
between tables; there is no relationship between the different elements that
I need to cater for.



Castle, Lindsay wrote and <snipped>:
> I'm working on a project that has a data set of approximately 6million
> with about 12,000 different elements, each element has 7 columns of data.
> I'm wondering what would be faster from a scanning perspective (SELECT
> statements with some calculations) for this type of set up;
> 	one table for all the data
> 	one table for each data element (12,000 tables)
> 	one table per subset of elements (eg all elements that start with
> "a" in a table)

I, for one, am having difficulty understanding exactly what your data 
looks like, so it's hard to give advice. Maybe some concrete examples of 
what you are calling "rows", "elements", and "columns" would help.

Does each of 6 million rows have 12000 elements, each with 7 columns? Or 
do you mean that out of 6 million rows, there are 12000 distinct kinds 
of elements?

> Can I do anything with Indexing to help with performance?  I suspect for
> majority of scans I will need to evaluate an outcome based on 4 or 5 of
> 7 columns of data.

Again, this isn't clear to me -- but maybe I'm just being dense ;-)
Does this mean you expect 4 or 5 items in your WHERE clause?
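
If the WHERE clause really does filter on the element plus a few of the
seven numeric columns, a multicolumn index with the most selective column
first is the usual starting point. This is only a sketch with the same
assumed table and column names as above, not something from the thread:

```sql
-- "element" has ~12,000 distinct values over ~6 million rows, so it is
-- likely the most selective filter; putting the assumed ordering column
-- second also helps the "last N rows for element x" pattern.
CREATE INDEX element_data_elem_time_idx
    ON element_data (element, recorded_at);
```

Running the candidate scans under EXPLAIN before and after creating the
index would show whether the planner actually uses it.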

