From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: PostgreSQL Developers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: suggestions to improve postgresql suitability for
Date: 2003-07-24 12:42:59
Message-ID: Pine.LNX.4.56.0307241442060.15688@sablons.ensmp.fr
Lists: pgsql-hackers
> > You want to process all invoices to count them and to sum up the
> > amounts on a per month/area/type basis. The initial data size is in
> > GB, but the size of the expected result is in KB (namely two values
> > for each of 100 areas * 12 months * 4 types).
>
> The key to handling large datasets for data mining is pre-aggregation
> based on the smallest time frame needed for details. I'd suggest running
> these large queries and storing the results in other tables, and then
> writing a set of functions to work with those aggregate tables.
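The pre-aggregation suggested above could be sketched roughly as follows, assuming a hypothetical `invoices(amount, invoice_date, area, type)` table; the expensive scan runs once and later reports read the small summary table instead:

```sql
-- Hypothetical schema: invoices(amount numeric, invoice_date date,
--                               area text, type text)
-- Scan the large table once; store the small per-month/area/type result.
CREATE TABLE invoice_summary AS
  SELECT date_trunc('month', invoice_date) AS month,
         area,
         type,
         count(*)    AS n_invoices,
         sum(amount) AS total_amount
  FROM invoices
  GROUP BY 1, 2, 3;
```

Subsequent queries against `invoice_summary` touch only KB of data rather than rescanning the GB-sized `invoices` table.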
Sure, that's what I do. I do not want to pay for several joins over 120M
tuples. However, the one or few "initial" queries still take significant
time and space, hence my mail about temporary storage and "on the fly"
data fetching, which would improve both speed and temporary storage
requirements for this type of application.
> No sense in summing up the same set of static data more than once if you
> can help it.
Sure. I never did that.
Thanks for your advice anyway,
--
Fabien.