Re: efficient data reduction (and deduping)

From: Alessandro Gagliardi <alessandro(at)path(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: efficient data reduction (and deduping)
Date: 2012-03-01 19:35:48
Message-ID: CAAB3BBL_6ju5QS2qEbmRAtePN2BOi==dBig925RjXH2QyyGzwA@mail.gmail.com
Lists: pgsql-performance

Interesting solution. If I'm not mistaken, this does solve the problem of
having two entries for the same user at the exact same time (which violates
my pk constraint) but it does so by leaving both of them out (since there
is no au1.hr_timestamp > au2.hr_timestamp in that case). Is that right?

On Thu, Mar 1, 2012 at 10:35 AM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>
> Try
>
> INSERT INTO hourly_activity
> SELECT ... everything from au1 ...
> FROM activity_unlogged au1
> LEFT JOIN activity_unlogged au2
>        ON au2.user_id = au1.user_id
>       AND date_trunc('hour', au2.hr_timestamp) = date_trunc('hour', au1.hr_timestamp)
>       AND au2.hr_timestamp < au1.hr_timestamp
> WHERE au2.user_id IS NULL;
>
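
One way to check how the quoted anti-join treats exact ties is to run it on a few rows. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, replaces date_trunc('hour', ...) with SQLite's strftime('%Y-%m-%d %H', ...), and invents a two-column version of activity_unlogged with made-up sample rows. None of that comes from the thread.

```python
import sqlite3

# Hypothetical stand-in for the real table: only the two columns the join uses.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity_unlogged (user_id INTEGER, hr_timestamp TEXT)")

# Made-up sample data (not from the thread).
rows = [
    (1, "2012-03-01 10:05:00"),  # user 1: earliest in the hour, duplicated
    (1, "2012-03-01 10:05:00"),
    (1, "2012-03-01 10:40:00"),  # user 1: later in the same hour
    (2, "2012-03-01 10:10:00"),  # user 2: earliest in the hour
    (2, "2012-03-01 10:30:00"),  # user 2: tied rows that are not the earliest
    (2, "2012-03-01 10:30:00"),
]
conn.executemany("INSERT INTO activity_unlogged VALUES (?, ?)", rows)

# Same shape as the quoted query: keep a row only when no strictly earlier
# row exists for the same user within the same hour.
# strftime(...) is an assumed SQLite substitute for date_trunc('hour', ...).
kept = conn.execute("""
    SELECT au1.user_id, au1.hr_timestamp
    FROM activity_unlogged au1
    LEFT JOIN activity_unlogged au2
           ON au2.user_id = au1.user_id
          AND strftime('%Y-%m-%d %H', au2.hr_timestamp)
              = strftime('%Y-%m-%d %H', au1.hr_timestamp)
          AND au2.hr_timestamp < au1.hr_timestamp
    WHERE au2.user_id IS NULL
    ORDER BY au1.user_id, au1.hr_timestamp
""").fetchall()

for row in kept:
    print(row)
```

In this sketch, both of user 1's tied rows at 10:05 come back, because neither has a strictly earlier row to match against. User 2's tied rows at 10:30 are dropped only because the 10:10 row precedes them. If PostgreSQL behaves the same way here, ties at the earliest timestamp in an hour would still both be inserted, so the primary key could still be violated unless some tie-breaker is added.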
