Re: partitioning a dataset + employing hysteresis condition

From: Amit Dor-Shifer <amit(dot)dor(dot)shifer(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: partitioning a dataset + employing hysteresis condition
Date: 2011-11-16 11:03:59
Message-ID: CAAznTxGDtD=Rg+0AXuM5+33P-B52p325JXYuPLY8PCUTDEjPMA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Nov 16, 2011 at 4:58 PM, David Johnston <polobo(at)yahoo(dot)com> wrote:

> On Nov 15, 2011, at 15:28, Gavin Flower <GavinFlower(at)archidevsys(dot)co(dot)nz>
> wrote:
>
> On 14/11/11 18:35, Amit Dor-Shifer wrote:
>
>
> On Mon, Nov 14, 2011 at 4:29 PM, Amit Dor-Shifer <<amit(dot)dor(dot)shifer(at)gmail(dot)com>
> amit(dot)dor(dot)shifer(at)gmail(dot)com> wrote:
>
>> Hi,
>> I've got this table:
>> create table phone_calls
>> (
>> start_time timestamp,
>> device_id integer,
>> term_status integer
>> );
>>
>> It describes phone call events. A 'term_status' is a sort-of an exit
>> status for the call, whereby a value != 0 indicates some sort of error.
>> Given that, I wish to retrieve data on devices with a persisting error on
>> them, of a specific type. I.E. that their last term_status was, say 2. I'd
>> like to employ some hysteresis on the query: only consider a device as
>> errorring if:
>> 1. the last "good" (0) term_status pre-dates a "bad" (2) term_status.
>> 2. it has at least N "bad" term_status events following the last "good"
>> one.
>> 3. The time span between the first "bad" term_status event and the last
>> one is >= T minutes
>>
>> For instance, w/the following data set:
>>
>>
> Alternative thought,
>
> Have a Boolean field which is set to true for non-zero entries and false
> for zeros. Upon entering a zero into the table, for a given device, set
> all currently true records to false. Combine with a partial index on the
> true and you can quickly get a listing of all devices in error mode and all
> the recent error entries.
>
> David J.
>

Really appreciate the attention. Thanks!

Here's what I've so-far come up with:

SELECT pc.device_id, MAX(lsc.last_successful) AS last_successful,
MIN(pc.start_time) AS faulting_since
FROM (
SELECT MAX(start_time) AS last_successful, device_id FROM phone_calls
WHERE term_status IS NOT NULL AND term_status = 0
GROUP BY device_id
UNION
SELECT NULL AS last_successful, device_id FROM phone_calls
GROUP BY device_id
HAVING EVERY(term_status = 2) = TRUE
) AS lsc,
phone_calls pc
WHERE
pc.device_id=lsc.device_id
AND pc.term_status=2
AND (lsc.last_successful is NULL OR pc.start_time > lsc.last_successful)
GROUP BY pc.device_id
HAVING MIN(pc.start_time) < ?
AND COUNT(term_status) >= ?

The two parameters to the query are T & N, in order, with a slight change:
T is a timestamp, so instead of specifying the a time span of 2 minutes, I
pass it as NOW() - INTERVAL '2 minute'.

with T=NOW() - INTERVAL '2 minute' I get the following on the a/m dataset:
N=4:
40;"2010-07-01 00:40:22";"2010-07-01 00:41:22"

N=3:
40;"2010-07-01 00:40:22";"2010-07-01 00:41:22"
50;"2010-07-01 12:00:00";"2010-07-01 12:01:00"
2;"2011-11-16 21:56:59.52107";"2011-11-16 21:57:59.52107"

N=2:
40;"2010-07-01 00:40:22";"2010-07-01 00:41:22"
2;"2011-11-16 21:55:16.88869";"2011-11-16 21:56:16.88869"
30;"2010-07-01 15:14:33";"2010-07-01 15:15:33"
50;"2010-07-01 12:00:00";"2010-07-01 12:01:00"

* The bit with the union is to accommodate for devices which never see a
single successful term_status.

Thanks a lot for the helpful hints :)

In response to

Browse pgsql-general by date

  From Date Subject
Next Message kyp404 2011-11-16 11:50:17 PostgreSQL-Slony error?
Previous Message MikeW 2011-11-16 09:26:46 Re: syntax highlighting in emacs after \e in psql