Skip site navigation (1) Skip section navigation (2)

Re: Oracle Analytical Functions

From: Willem Buitendyk <willem(at)pcfish(dot)ca>
To: Adam Rich <adam(dot)r(at)sbcglobal(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Oracle Analytical Functions
Date: 2008-01-31 20:48:06
Message-ID: 47A23406.2000803@pcfish.ca (view raw or flat)
Thread:
Lists: pgsql-general
Here is a little test example.  It seems that the second order by 
condition is not working - in this case datetime.

create table arb_test (
client_id integer,
arbnum integer);

insert into arb_test values (2,1);
insert into arb_test values (2,33);
insert into arb_test values (2,6);
insert into arb_test values (2,76);
insert into arb_test values (2,111);
insert into arb_test values (2,10);
insert into arb_test values (2,55);
insert into arb_test values (7,12);
insert into arb_test values (7,6);
insert into arb_test values (7,144);
insert into arb_test values (7,63);
insert into arb_test values (7,87);
insert into arb_test values (7,24);
insert into arb_test values (7,22);
insert into arb_test values (1,14);
insert into arb_test values (1,23);
insert into arb_test values (1,67);
insert into arb_test values (1,90);
insert into arb_test values (1,2);
insert into arb_test values (1,5);
insert into arb_test values (5,8);
insert into arb_test values (5,42);
insert into arb_test values (5,77);
insert into arb_test values (5,9);
insert into arb_test values (5,89);
insert into arb_test values (5,23);
insert into arb_test values (5,11);

DROP SEQUENCE if exists seq1;
DROP SEQUENCE if exists seq2;
CREATE TEMPORARY SEQUENCE seq1 CACHE 1000;
CREATE TEMPORARY SEQUENCE seq2 CACHE 1000;

select a.client_id, b.arbnum, a.arbnum as previousarbnum, (b.arbnum -
a.arbnum) as diffarbnum from
(select nextval('seq1') as s, client_id, arbnum from arb_test
order by client_id, arbnum OFFSET 0) as a
inner join
(select nextval('seq2') as s, client_id, arbnum from arb_test
order by client_id, arbnum OFFSET 0) as b
on a.s=(b.s-1) where a.client_id=b.client_id;

--create or replace view arb_view as select * from arb_test order by 
client_id, arbnum;

Here are the results:

 client_id | arbnum | previousarbnum | diffarbnum
-----------+--------+----------------+------------
         1 |     23 |             14 |          9
         1 |     67 |             23 |         44
         1 |     90 |             67 |         23
         1 |      2 |             90 |        -88
         1 |      5 |              2 |          3
         2 |     33 |              1 |         32
         2 |      6 |             33 |        -27
         2 |     76 |              6 |         70
         2 |    111 |             76 |         35
         2 |     10 |            111 |       -101
         2 |     55 |             10 |         45
         5 |     42 |              8 |         34
         5 |     77 |             42 |         35
         5 |      9 |             77 |        -68
         5 |     89 |              9 |         80
         5 |     23 |             89 |        -66
         5 |     11 |             23 |        -12
         7 |      6 |             12 |         -6
         7 |    144 |              6 |        138
         7 |     63 |            144 |        -81
         7 |     87 |             63 |         24
         7 |     24 |             87 |        -63

When I used a sorted view:

create or replace view arb_view as select * from arb_test order by 
client_id, arbnum;

and redid it the results are:

 client_id | arbnum | previousarbnum | diffarbnum
-----------+--------+----------------+------------
         1 |      5 |              2 |          3
         1 |     14 |              5 |          9
         1 |     23 |             14 |          9
         1 |     67 |             23 |         44
         1 |     90 |             67 |         23
         2 |      6 |              1 |          5
         2 |     10 |              6 |          4
         2 |     33 |             10 |         23
         2 |     55 |             33 |         22
         2 |     76 |             55 |         21
         2 |    111 |             76 |         35
         5 |      9 |              8 |          1
         5 |     11 |              9 |          2
         5 |     23 |             11 |         12
         5 |     42 |             23 |         19
         5 |     77 |             42 |         35
         5 |     89 |             77 |         12
         7 |     12 |              6 |          6
         7 |     22 |             12 |         10
         7 |     24 |             22 |          2
         7 |     63 |             24 |         39
         7 |     87 |             63 |         24
         7 |    144 |             87 |         57
(23 rows)


This works the way it should. 

--drop table arb_test;
--drop view arb_view;

willem
> The 'all_client_times' table has 753698 rows.  The lagfunc() on the 
> sorted view returns 753576 rows and appears to work exactly as 
> needed.  Using the function on an unsorted table returns only 686 rows 
> and is missing a whole lot of data.  Running the count query returns 
> 122 - which is correct as the amount of clients that I have.  Each 
> client has between 5 - 7K  records each.
>
> The way I see it is for each client there will be one row, namely, the 
> first in the series, that will not be included in the final results as 
> it would not have a previous time.  With that in mind, if I take my 
> table row count as 753698 and minus the amount of clients I have, 
> 122,  then I should get the number of results as 753576 which is 
> correct when I use your methods on a sorted table but which is not 
> correct when I  use your methods on an unsorted table.
>
> willem
>
> Adam Rich wrote:
>> Hi Willem,
>>
>>  
>>> for some reason the order by's aren't working.
>>>     
>>
>> Could you provide more details?  Do you get a specific error message?
>>
>>  
>>> only returning 658 rows instead of the 750K.
>>>     
>>
>> You should not expect the same row count in both source table and
>> result set.  Even in your example -- you provided 8 source rows, and
>> 4 result rows.  You can determine the correct number of results via
>> "the number of records, related to client_ids having two or more 
>> records in all_client_times, minus one".  It may be true that you 
>> have 750k
>> records but only 658 rows that satisfy this requirement.
>>
>> What do you get for this query?
>>
>> select count(*) from ( select client_id, count(*) as rows from 
>> all_client_times group by client_id having count(*) > 1 ) as x
>>
>>
>> Adam
>>
>>
>>
>>
>>   
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>               http://archives.postgresql.org/
>


In response to

Responses

pgsql-general by date

Next:From: Adam RichDate: 2008-01-31 21:04:42
Subject: Re: Oracle Analytical Functions
Previous:From: Alvaro HerreraDate: 2008-01-31 20:39:25
Subject: Re: Log file permissions?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group