Re: efficient data reduction (and deduping)

From: Alessandro Gagliardi <alessandro(at)path(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: efficient data reduction (and deduping)
Date: 2012-03-01 19:35:48
Message-ID: CAAB3BBL_6ju5QS2qEbmRAtePN2BOi==dBig925RjXH2QyyGzwA@mail.gmail.com
Lists: pgsql-performance

Interesting solution. If I'm not mistaken, this does solve the problem of
having two entries for the same user at the exact same time (which violates
my pk constraint) but it does so by leaving both of them out (since there
is no au1.hr_timestamp > au2.hr_timestamp in that case). Is that right?

On Thu, Mar 1, 2012 at 10:35 AM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>
> Try
>
> INSERT INTO hourly_activity
> SELECT ... everything from au1 ...
> FROM activity_unlogged au1
> LEFT JOIN activity_unlogged au2
>   ON au2.user_id = au1.user_id
>   AND date_trunc('hour', au2.hr_timestamp) = date_trunc('hour', au1.hr_timestamp)
>   AND au2.hr_timestamp < au1.hr_timestamp
> WHERE au2.user_id IS NULL;
>
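
One way to check the tie behaviour asked about above is to run the quoted
anti-join on a handful of rows. The sketch below is only an illustration, not
the original setup: it uses Python's built-in sqlite3 module instead of
PostgreSQL, substitutes strftime('%Y-%m-%d %H', ...) for date_trunc('hour', ...)
because SQLite has no date_trunc, reduces the table to the two columns the
join uses, and the sample rows are invented.

```python
import sqlite3

# Hypothetical sample data: (user_id, hr_timestamp). Table and column names
# follow the quoted query; the rows themselves are made up for illustration.
rows = [
    (1, "2012-03-01 10:05:00"),  # user 1: two rows tied at the earliest time
    (1, "2012-03-01 10:05:00"),
    (1, "2012-03-01 10:40:00"),  # user 1: a later row in the same hour
    (2, "2012-03-01 10:10:00"),  # user 2: a unique earliest row
    (2, "2012-03-01 10:30:00"),  # user 2: two rows tied at a later time
    (2, "2012-03-01 10:30:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity_unlogged (user_id INTEGER, hr_timestamp TEXT)")
conn.executemany("INSERT INTO activity_unlogged VALUES (?, ?)", rows)

# Same anti-join shape as the quoted query: keep an au1 row only when no au2
# row exists for the same user and hour with a strictly earlier timestamp.
# strftime(...) stands in for date_trunc('hour', ...) (SQLite assumption).
kept = conn.execute("""
    SELECT au1.user_id, au1.hr_timestamp
    FROM activity_unlogged au1
    LEFT JOIN activity_unlogged au2
      ON au2.user_id = au1.user_id
      AND strftime('%Y-%m-%d %H', au2.hr_timestamp)
        = strftime('%Y-%m-%d %H', au1.hr_timestamp)
      AND au2.hr_timestamp < au1.hr_timestamp
    WHERE au2.user_id IS NULL
    ORDER BY au1.user_id, au1.hr_timestamp
""").fetchall()

for row in kept:
    print(row)
```

On this sample the query returns both of user 1's 10:05 rows plus user 2's
10:10 row. That suggests rows tied at the earliest timestamp in an hour are
both kept (so a primary key on user and hour would still be violated), while
rows tied at a later timestamp are both dropped because an earlier row exists.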
