Re: Oracle Analytical Functions

From: "Adam Rich" <adam(dot)r(at)sbcglobal(dot)net>
To: "'Willem Buitendyk'" <willem(at)pcfish(dot)ca>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Oracle Analytical Functions
Date: 2008-01-31 21:04:42
Message-ID: 014e01c8644c$e855d350$b90179f0$@r@sbcglobal.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


Ah, ok. I see what's happening. The data is retrieved from the tables,
and the sequence values are added, PRIOR to the order by, so that after
the order by, they are no longer sorted. (The same thing can happen
in Oracle with ROWNUM).

You can go the sorted view route, or just an inline view, like this:

select a.client_id, b.arbnum, a.arbnum as previousarbnum, (b.arbnum -
a.arbnum) as diffarbnum from
(select nextval('seq1') as s, * from (select client_id, arbnum from arb_test
order by client_id, arbnum OFFSET 0) as y OFFSET 0) as a
inner join (select nextval('seq2') as s, * from (select client_id, arbnum
from arb_test order by client_id, arbnum OFFSET 0)as z OFFSET 0) as b
on a.s=(b.s-1) where a.client_id=b.client_id;

> -----Original Message-----
> From: Willem Buitendyk [mailto:willem(at)pcfish(dot)ca]
> Sent: Thursday, January 31, 2008 2:48 PM
> To: Adam Rich
> Cc: pgsql-general(at)postgresql(dot)org
> Subject: Re: [GENERAL] Oracle Analytical Functions
>
> Here is a little test example. It seems that the second order by
> condition is not working - in this case datetime.
>
> create table arb_test (
> client_id integer,
> arbnum integer);
>
> insert into arb_test values (2,1);
> insert into arb_test values (2,33);
> insert into arb_test values (2,6);
> insert into arb_test values (2,76);
> insert into arb_test values (2,111);
> insert into arb_test values (2,10);
> insert into arb_test values (2,55);
> insert into arb_test values (7,12);
> insert into arb_test values (7,6);
> insert into arb_test values (7,144);
> insert into arb_test values (7,63);
> insert into arb_test values (7,87);
> insert into arb_test values (7,24);
> insert into arb_test values (7,22);
> insert into arb_test values (1,14);
> insert into arb_test values (1,23);
> insert into arb_test values (1,67);
> insert into arb_test values (1,90);
> insert into arb_test values (1,2);
> insert into arb_test values (1,5);
> insert into arb_test values (5,8);
> insert into arb_test values (5,42);
> insert into arb_test values (5,77);
> insert into arb_test values (5,9);
> insert into arb_test values (5,89);
> insert into arb_test values (5,23);
> insert into arb_test values (5,11);
>
> DROP SEQUENCE if exists seq1;
> DROP SEQUENCE if exists seq2;
> CREATE TEMPORARY SEQUENCE seq1 CACHE 1000;
> CREATE TEMPORARY SEQUENCE seq2 CACHE 1000;
>
> select a.client_id, b.arbnum, a.arbnum as previousarbnum, (b.arbnum -
> a.arbnum) as diffarbnum from
> (select nextval('seq1') as s, client_id, arbnum from arb_test
> order by client_id, arbnum OFFSET 0) as a
> inner join
> (select nextval('seq2') as s, client_id, arbnum from arb_test
> order by client_id, arbnum OFFSET 0) as b
> on a.s=(b.s-1) where a.client_id=b.client_id;
>
> --create or replace view arb_view as select * from arb_test order by
> client_id, arbnum;
>
> Here are the results:
>
> client_id | arbnum | previousarbnum | diffarbnum
> -----------+--------+----------------+------------
> 1 | 23 | 14 | 9
> 1 | 67 | 23 | 44
> 1 | 90 | 67 | 23
> 1 | 2 | 90 | -88
> 1 | 5 | 2 | 3
> 2 | 33 | 1 | 32
> 2 | 6 | 33 | -27
> 2 | 76 | 6 | 70
> 2 | 111 | 76 | 35
> 2 | 10 | 111 | -101
> 2 | 55 | 10 | 45
> 5 | 42 | 8 | 34
> 5 | 77 | 42 | 35
> 5 | 9 | 77 | -68
> 5 | 89 | 9 | 80
> 5 | 23 | 89 | -66
> 5 | 11 | 23 | -12
> 7 | 6 | 12 | -6
> 7 | 144 | 6 | 138
> 7 | 63 | 144 | -81
> 7 | 87 | 63 | 24
> 7 | 24 | 87 | -63
>
> When I used a sorted view:
>
> create or replace view arb_view as select * from arb_test order by
> client_id, arbnum;
>
> and redid it the results are:
>
> client_id | arbnum | previousarbnum | diffarbnum
> -----------+--------+----------------+------------
> 1 | 5 | 2 | 3
> 1 | 14 | 5 | 9
> 1 | 23 | 14 | 9
> 1 | 67 | 23 | 44
> 1 | 90 | 67 | 23
> 2 | 6 | 1 | 5
> 2 | 10 | 6 | 4
> 2 | 33 | 10 | 23
> 2 | 55 | 33 | 22
> 2 | 76 | 55 | 21
> 2 | 111 | 76 | 35
> 5 | 9 | 8 | 1
> 5 | 11 | 9 | 2
> 5 | 23 | 11 | 12
> 5 | 42 | 23 | 19
> 5 | 77 | 42 | 35
> 5 | 89 | 77 | 12
> 7 | 12 | 6 | 6
> 7 | 22 | 12 | 10
> 7 | 24 | 22 | 2
> 7 | 63 | 24 | 39
> 7 | 87 | 63 | 24
> 7 | 144 | 87 | 57
> (23 rows)
>
>
> This works the way it should.
>
> --drop table arb_test;
> --drop view arb_view;
>
> willem
> > The 'all_client_times' table has 753698 rows. The lagfunc() on the
> > sorted view returns 753576 rows and appears to work exactly as
> > needed. Using the function on an unsorted table returns only 686
> rows
> > and is missing a whole lot of data. Running the count query returns
> > 122 - which is correct as the amount of clients that I have. Each
> > client has between 5 - 7K records each.
> >
> > The way I see it is for each client there will be one row, namely,
> the
> > first in the series, that will not be included in the final results
> as
> > it would not have a previous time. With that in mind, if I take my
> > table row count as 753698 and minus the amount of clients I have,
> > 122, then I should get the number of results as 753576 which is
> > correct when I use your methods on a sorted table but which is not
> > correct when I use your methods on an unsorted table.
> >
> > willem
> >
> > Adam Rich wrote:
> >> Hi Willem,
> >>
> >>
> >>> for some reason the order by's aren't working.
> >>>
> >>
> >> Could you provide more details? Do you get a specific error
> message?
> >>
> >>
> >>> only returning 658 rows instead of the 750K.
> >>>
> >>
> >> You should not expect the same row count in both source table and
> >> result set. Even in your example -- you provided 8 source rows, and
> >> 4 result rows. You can determine the correct number of results via
> >> "the number of records, related to client_ids having two or more
> >> records in all_client_times, minus one". It may be true that you
> >> have 750k
> >> records but only 658 rows that satisfy this requirement.
> >>
> >> What do you get for this query?
> >>
> >> select count(*) from ( select client_id, count(*) as rows from
> >> all_client_times group by client_id having count(*) > 1 ) as x
> >>
> >>
> >> Adam
> >>
> >>
> >>
> >>
> >>
> >
> >
> > ---------------------------(end of broadcast)------------------------
> ---
> > TIP 4: Have you searched our list archives?
> >
> > http://archives.postgresql.org/
> >

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Andrej Ricnik-Bay 2008-01-31 21:04:53 Re: Is PostGreSql's Data storage mechanism "inferior"?
Previous Message Willem Buitendyk 2008-01-31 20:48:06 Re: Oracle Analytical Functions