Re: Counting # of consecutive rows with specified value(s)?

From: David Johnston <polobo(at)yahoo(dot)com>
To: Ken Tanzer <ken(dot)tanzer(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Counting # of consecutive rows with specified value(s)?
Date: 2012-06-07 04:26:45
Message-ID: FCDB9A00-31E0-48A2-98BE-39977344699D@yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Jun 6, 2012, at 22:20, Ken Tanzer <ken(dot)tanzer(at)gmail(dot)com> wrote:

> I'm working with an attendance table, where each person gets a record for each day of class, with an attendance code (ABSENT, ATTENDED, ...). I'm trying to figure out how to get the number of consecutive absences a person has. I'm guessing this can be done without writing a function, but I need some help here.
>
> I can currently test whether someone has at least a specified number of consecutive absences with the query below, but it would be better to get the actual number.
>
> As a second question, what about getting the number of consecutive records for a set of values? (e.g., attendance_code IN ('ATTENDED','EXCUSED')
>
> Any ideas or suggestions? Thanks.
>
> Ken
>
> This query checks for 4 consecutive absences:
>
> SELECT client_id,
> array(
> SELECT attendance_code
> FROM attendance
> WHERE client_id=enrollment.client_id
> ORDER BY attended_on DESC
> LIMIT 4
> )=array_fill('ABSENT'::varchar,array[4]) AS absent_last_4
> FROM enrollment;
>

Quick thought if you have window functions. If you perform a rank over the appropriate subset of columns (with order by) you might be able to get consecutive records with matching codes to have the same rank. The in an outer query you can simply sum with the rank as the group. I am unsure which function gives you equal values for ties, or whether this approach will even work, but it seems worth considering at least. Using a min date your result might be readable as "starting on 9/1, 4 absences; starting on 9/5, 21 attendeds; etc...".

Consider using a frame of 1 prior row to see whether a given date/code is different than the immediately preceding day. If it is that day qualifies as the possible beginning of a chain otherwise it has to be added to an existing chain. You probably need to do this first then use the dates of the applicable chain starts (and ends maybe...) as fields in the partition mentioned above.

HTH

David J.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Craig Ringer 2012-06-07 08:29:55 Re: Procedural Languages
Previous Message Manoj Agarwal 2012-06-07 03:30:38 Re: Problem while restoring a database from SQL_ASCII to UTF-8