| From: | Alessandro Gagliardi <alessandro(at)path(dot)com> |
|---|---|
| To: | Peter van Hardenberg <pvh(at)pvh(dot)ca> |
| Cc: | pgsql-performance(at)postgresql(dot)org |
| Subject: | Re: efficient data reduction (and deduping) |
| Date: | 2012-03-01 19:28:32 |
| Message-ID: | CAAB3BBKS2ncPZROej1OtE_k-DAEsRpJ2JP8w7zhoVJ8+i3EfDA@mail.gmail.com |
| Lists: | pgsql-performance |
I was thinking of adding an index, but thought it would be pointless since
I would only use the index once before dropping the table (after it's
loaded into hourly_activity). I assumed it would take longer to create the
index and then use it than to just seq scan once or twice. Am I wrong in
that assumption?
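A minimal sketch of how one might test that assumption with EXPLAIN ANALYZE; the table and column names here (activity_staging, user_id, hr_timestamp) are placeholders, not taken from the thread:

```sql
-- Plan A: a single sequential-scan pass, no index.
EXPLAIN ANALYZE
SELECT user_id,
       date_trunc('hour', hr_timestamp) AS hr,
       max(hr_timestamp)                AS last_seen
FROM   activity_staging
GROUP  BY user_id, date_trunc('hour', hr_timestamp);

-- Plan B: build the index first, then rerun the same query and add the two
-- timings together (psql's \timing shows wall-clock time per statement).
-- Note: if hr_timestamp is timestamptz, date_trunc() is not IMMUTABLE and
-- this expression index will be rejected.
CREATE INDEX activity_staging_hr_idx
    ON activity_staging (user_id, date_trunc('hour', hr_timestamp));

EXPLAIN ANALYZE
SELECT user_id,
       date_trunc('hour', hr_timestamp) AS hr,
       max(hr_timestamp)                AS last_seen
FROM   activity_staging
GROUP  BY user_id, date_trunc('hour', hr_timestamp);
```

Whether Plan B wins depends on whether the planner actually uses the index for the grouped query; the EXPLAIN output shows that directly.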
On Thu, Mar 1, 2012 at 10:40 AM, Peter van Hardenberg <pvh(at)pvh(dot)ca> wrote:
> On Thu, Mar 1, 2012 at 10:27 AM, Alessandro Gagliardi
> <alessandro(at)path(dot)com> wrote:
> > Now, I want to reduce that data to get the last activity that was
> > performed by each user in any given hour. It should fit into a table
> > like this:
> >
>
> How about:
>
> 1) Create an expression-based index on date_trunc('hour', hr_timestamp)
> 2) Create a view on that showing the last value
> 3) If you want to throw away the data use CREATE TABLE AS on the
> results of the view.
>
> You may also want to investigate window functions.
>
> -p
>
> --
> Peter van Hardenberg
> San Francisco, California
> "Everything was beautiful, and nothing hurt." -- Kurt Vonnegut
>
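A sketch of the approach Peter outlines above, assuming a raw table named activity with user_id, hr_timestamp, and action columns (all names are guesses; adjust to the real schema):

```sql
-- 1) Expression index on the truncated hour.  This assumes hr_timestamp is
--    timestamp WITHOUT time zone; date_trunc() on timestamptz is only
--    STABLE, so it cannot be used directly in an index expression.
CREATE INDEX activity_hour_idx
    ON activity (date_trunc('hour', hr_timestamp));

-- 2) A view exposing each user's last activity in each hour.  DISTINCT ON
--    is the PostgreSQL shortcut; a row_number() window function (shown
--    below) is one way to do it with the window functions Peter mentions.
CREATE VIEW last_activity_per_hour AS
SELECT DISTINCT ON (user_id, date_trunc('hour', hr_timestamp))
       user_id,
       date_trunc('hour', hr_timestamp) AS hr,
       hr_timestamp,
       action
FROM   activity
ORDER  BY user_id, date_trunc('hour', hr_timestamp), hr_timestamp DESC;

-- 3) Materialize the reduced data if the raw rows are to be thrown away.
CREATE TABLE hourly_activity AS
SELECT * FROM last_activity_per_hour;

-- Window-function version of step 2:
SELECT user_id, hr, hr_timestamp, action
FROM (
    SELECT user_id,
           date_trunc('hour', hr_timestamp) AS hr,
           hr_timestamp,
           action,
           row_number() OVER (
               PARTITION BY user_id, date_trunc('hour', hr_timestamp)
               ORDER BY hr_timestamp DESC
           ) AS rn
    FROM activity
) ranked
WHERE rn = 1;
```

The index in step 1 mainly pays off if later queries filter on the truncated hour; for a single full pass the planner may still prefer a sequential scan, which circles back to the question at the top of this message.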