Quick Links

Re: efficient data reduction (and deduping)

From:	Alessandro Gagliardi <alessandro(at)path(dot)com>
To:	Peter van Hardenberg <pvh(at)pvh(dot)ca>
Cc:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: efficient data reduction (and deduping)
Date:	2012-03-01 19:28:32
Message-ID:	CAAB3BBKS2ncPZROej1OtE_k-DAEsRpJ2JP8w7zhoVJ8+i3EfDA@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-performance

I was thinking of adding an index, but thought it would be pointless since
I would only be using the index once before dropping the table (after its
loaded into hourly_activity). I assumed it would take longer to create the
index and then use it than to just seq scan once or twice. Am I wrong in
that assumption?

On Thu, Mar 1, 2012 at 10:40 AM, Peter van Hardenberg <pvh(at)pvh(dot)ca> wrote:

> On Thu, Mar 1, 2012 at 10:27 AM, Alessandro Gagliardi
> <alessandro(at)path(dot)com> wrote:
> > Now, I want to reduce that data to get the last activity that was
> performed
> > by each user in any given hour. It should fit into a table like this:
> >
>
> How about:
>
> 1) Create an expression based index on date_trunc('hour', hr_timestamp)
> 2) Create a view on that showing the last value
> 3) If you want to throw away the data use CREATE TABLE AS on the
> results of the view.
>
> You may also want to investigate window functions.
>
> -p
>
> --
> Peter van Hardenberg
> San Francisco, California
> "Everything was beautiful, and nothing hurt." -- Kurt Vonnegut
>

In response to

Re: efficient data reduction (and deduping) at 2012-03-01 18:40:20 from Peter van Hardenberg

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Alessandro Gagliardi	2012-03-01 19:29:00	Re: efficient data reduction (and deduping)
Previous Message	Kääriäinen Anssi	2012-03-01 19:06:40	Re: Large insert and delete batches