Quick Links

Re: Strategy for doing number-crunching

From:	Matthew Foster <matthew(dot)foster(at)noaa(dot)gov>
To:	pgsql-novice(at)postgresql(dot)org
Subject:	Re: Strategy for doing number-crunching
Date:	2012-01-09 21:03:19
Message-ID:	CAP1ZYZFs59c+zM5XL_s=AQedjyDn18C0BLULmCtfarY+E6Yqpw@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-novice

On Wed, Jan 4, 2012 at 3:11 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Matthew Foster <matthew(dot)foster(at)noaa(dot)gov> writes:
> > On Wed, Jan 4, 2012 at 10:48 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Matthew Foster <matthew(dot)foster(at)noaa(dot)gov> writes:
> >>> We have a database with approximately 130M rows, and we need to produce
> >>> statistics (e.g. mean, standard deviation, etc.) on the data. Right
> now,
> >>> we're generating these stats via a single SELECT, and it is extremely
> >>> slow...like it can take hours to return results.
>
> >> What datatype are the columns being averaged? If "numeric", consider
> >> casting to float8 before applying the aggregates. You'll lose some
> >> precision but it'll likely be orders of magnitude faster.
>
> > The data are type double.
>
> Hmm. In that case I think you have some other problem that's hidden in
> details you didn't show us. It should not take "hours" to process only
> 130M rows. This would best be taken up on pgsql-performance; please see
> http://wiki.postgresql.org/wiki/Slow_Query_Questions
>
> regards, tom lane
>

Tom,

I think you are absolutely right. Some additional testing, with the
arithmetic removed from the queries, still shows very slow performance.

I'll do some more digging, and perhaps take this to the performance list.
Thanks for your advice!

Matt

In response to

Re: Strategy for doing number-crunching at 2012-01-04 21:11:21 from Tom Lane

Browse pgsql-novice by date

	From	Date	Subject
Next Message	Christian Tonhäuser	2012-01-11 08:45:01	Too much RAM allocated by webserver when executing an Insert-Statement (npgsql)
Previous Message	JORGE MALDONADO	2012-01-09 20:12:53	Duplicate information in parent and child tables