================================================================================
                                    NORMALIZE
================================================================================

This is an exemplary walkthrough of a temporal NORMALIZE operation, from the
query input, over the query rewrite, until the result tuple outputs during
execution.


EXAMPLE
--------------------------------------------------------------------------------

Question: How many employees work in a department at each time point?

CREATE TABLE empl (name VARCHAR, dept VARCHAR, time INT4RANGE);
INSERT INTO empl VALUES
('Ann', 'DB', '[1,5)'),
('Ann', 'AI', '[3,8)'),
('Bea', 'DB', '[3,9)');


Timeline representation:

    NAME  DEPT      TIME
    Ann   DB        [1,5)       ----        <- Tuple 1, valid from 1 to 5 excl.
    Ann   AI        [3,8)         -----     <- Tuple 2
    Bea   DB        [3,9)         ------    <- Tuple 3
                                123456789   <- Timeline


NORMALIZE query:

    SELECT count(*), dept, time
    FROM (
        empl a NORMALIZE empl b USING(dept) WITH (a.time, b.time)
    ) c
    GROUP BY dept, time;


Result:

    COUNT  DEPT     TIME
    1      DB       [1,3)       --
    1      AI       [3,8)         -----
    2      DB       [3,5)         --
    1      DB       [5,9)           ----
                                123456789  <- Timeline


STEP-BY-STEP EXPLANATION
--------------------------------------------------------------------------------

After receiving the NORMALIZE query (see above) as input, we rewrite it into
the following query:

    SELECT a.*, P1
    FROM (
	    SELECT *, row_id() OVER () rn FROM empl
    ) a
    LEFT OUTER JOIN (
	    SELECT dept, LOWER(time) P1 FROM empl
	    UNION
	    SELECT dept, UPPER(time) P1 FROM empl
    ) b
    ON a.dept = b.dept AND P1 <@ a.time
    ORDER BY rn, P1;


Intermediate result: This is the input of our NORMALIZE execution node...
(See Appendix [1] for details)

     TUPID   name | dept | time  | rn | p1
            ------+------+-------+----+----
       A     Ann  | DB   | [1,5) |  1 |  1
       B     Ann  | DB   | [1,5) |  1 |  3
       C     Ann  | AI   | [3,8) |  2 |  3
       D     Bea  | DB   | [3,9) |  3 |  3
       E     Bea  | DB   | [3,9) |  3 |  5


We have three possibilities inside NORMALIZE during execution:
(See Appendix [2] for details)

    1) CURR == PREV
       a) S < CURR.P1   --> generate (CURR, [S, CURR.P1)) and set S = CURR.P1
       b) otherwise     --> fetch next tuple
          if fetched tuple is NULL --> goto(2) forcing CURR != PREV to be true
    2) CURR != PREV
       a) S < PREV.te   --> generate (PREV, [S, PREV.te))
       ... set S = CURR.ts and goto(1) forcing CURR == PREV to be true


Executor steps for our example query:

1) First call of ExecTemporalAdjustment: Fetch tuple A, which is now CURR.
   Set sweepline S = 1 (A.ts), and copy tuple A also into PREV. Goto(1)
   forcing CURR == PREV to be true.

2) (1): CURR == PREV, that is, tuple A is equal to itself.
   (1b): Fetch tuple B; CURR = B, PREV = A

3) (1): A == B
   (1a): S < B.P1 (1 < 3), hence we generate a result tuple:
        (Ann, DB, [1,3))
   ...where the interval is [S, CURR.P1) and all other attributes are
   copied from the current tuple (omitting helper columns: RN and P1)
   We update the sweepline: S = 3 (B.P1).

4) (1): A == B.
   (1b): Fetch tuple C; CURR = C, PREV = B

5) (2): B != C.
   (2a): S < B.te (3 < 5), hence we generate a result tuple:
        (Ann, DB, [3, 5))
   We update the sweepline: S = 3 (C.ts).

6) (1): B == C (forced).
   (1b): Fetch tuple D; CURR = D, PREV = C

7) (2): C != D.
   (2a): S < C.te (3 < 8), hence we generate a result tuple:
        (Ann, AI, [3,8))
   We update the sweepline: S = 3 (D.ts).

8) (1): C == D (forced).
   (1b): Fetch tuple E; CURR = E, PREV = D

9) (1): D == E.
   (1a): S < E.P1 (3 < 5), hence we generate a result tuple:
        (Bea, DB, [3, 5))
   We update the sweepline: S = 5 (E.P1).

10) (1): D == E.
    (1b): Fetch tuple gives NULL

11) (2): D != E (forced).
    (2a): S < E.te (5 < 9), hence we generate a result tuple:
        (Bea, DB, [5, 9))
    We update the sweepline: S = 3 (E.ts).

No new tuple can be fetched, nor is any new result tuple produced, hence we
terminate our executor.

The result of NORMALIZE is as follows:

     name | dept | time
    ------+------+-------
     Ann  | DB   | [1,3)
     Ann  | DB   | [3,5)
     Ann  | AI   | [3,8)
     Bea  | DB   | [3,5)
     Bea  | DB   | [5,9)

The initial goal was to aggregate over "dept" and "time", and output the count
of each. These can now be easily done from this normalized input with standard
(reused) PostgreSQL operators.


Final result is:

     count | dept | time
    -------+------+-------
     1     | DB   | [1,3)
     1     | AI   | [3,8)
     2     | DB   | [3,5)
     1     | DB   | [5,9)


Best regards,
Anton, Michael, Johann, Peter


--------------------------------------------------------------------------------
Appendix:

[1]
After the intermediate result we get an input, which is splitted into so-called
temporal groups. Each of these groups satisfy the USING-condition (a.dept =
b.dept), has overlapping periods, and a row number as ID (rn). The input is
ordered by temporal group ID, and the start and ending timepoints, i.e., P1.
Temporal normalizers do not make a distinction between start and end timepoints
while grouping, therefore we have only one timepoint attribute here (i.e., P1),
which is the union of start and end timepoints. Please note, that each tuple in
a certain rn ID is always equal to any other tuple in that rn ID.

[2]
Additinal information for NORMALIZE:
    - sweepline: to sweep from left-to-right on the time line (S)
    - slots to the current and previous tuple (CURR, PREV)
    - Two tuples A and B are equal, i.e. A == B, if A.rn == B.rn
    - ts means LOWER(time), and te means UPPER(time)