Re: Underspecified window queries in regression tests

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Florian Pflug <fgp(at)phlo(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Greg Stark <stark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>
Subject: Re: Underspecified window queries in regression tests
Date: 2012-08-16 15:23:11
Message-ID: 20120816152311.GK8353@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


I have used your notes below to rewrite the Window function SQL manual
section. As you said, it was very hard to read. I now understand it
better, having restructured it, and I hope others do too.

After waiting 30 minutes for our developer doc build to refresh, I am
giving up and posting my own URL for the doc changes:

http://momjian.us/tmp/pgsql/sql-expressions.html#SYNTAX-WINDOW-FUNCTIONS

Perhaps I need to go back to having my own doc build.

---------------------------------------------------------------------------

On Mon, Oct 17, 2011 at 11:48:38AM +0200, Florian Pflug wrote:
> On Oct17, 2011, at 01:09 , Tom Lane wrote:
> > Florian Pflug <fgp(at)phlo(dot)org> writes:
> >> ... reading those parts again, I realize the it says "When ORDER BY is omitted
> >> the *default* frame consists ... ", and that the second quote is followed
> >> by a footnote which says
> >
> >> There are options to define the window frame in other ways, but this tutorial
> >> does not cover them. See Section 4.2.8 for details. [3.5, Window Functions]
> >
> >> So it was just me being thick. Sorry for the noise.
> >
> > Hmm. Maybe the use of a <footnote> there is too subtle, and we should
> > instead have that text in-line (probably in parentheses)? Or we could
> > use a <note>, but that's probably too much emphasis.
>
> Inline and in parentheses sounds fine.
>
> In addition, I think we should reword the explanation in 4.2.8 (The SQL Language
> / SQL Syntax / Value Expressions / Window Functions). Instead of that rather
> long (and IMHO hard to read) paragraph about possible frame clauses and their
> behaviour in the presence or absence of an ORDER BY clause, we should go with
> a more algorithmic explanation I think.
>
> Something along these lines maybe:
>
> ----------
> .) PARTITION BY splits the rows into disjoint partitions. All further processing
> happens only inside a single partition
>
> .) In RANGE mode, ORDER BY then splits each partition into an ordered list of
> sub-partitions, each containing rows which the ORDER BY considers to be
> equivalent.
>
> .) In ROWS mode, OTOH, each sub-partition contains only a single row. Thus, if
> there are rows which are considered to be equivalent by the ORDER BY, the
> ordering of the sub-partition isn't fully determined.
>
> .) Each row's frame then consists of some consecutive range of sub-partitions.
>
> .) In RANGE mode, that consecutive range can only start at either the first
> sub-partition or the current row's sub-partition, and can only end at either
> the current row's sub-partition or the last sub-partitions.
>
> .) In ROWS mode, the consecutive range may additional start <n> sub-partitions
> (or rows, it's the same thing here) before the current row, and may additionally
> end <m> sub-partitions/rows after the current row.
>
> >From that, it follows that even with an underspecified sort order, the contents of
> each frame are still fully determined in RANGE mode. The ordering of rows within
> a frame is not determined, though. So overall, in RANGE mode, a query's result is
> only non-deterministic if the window function is sensitive to the ordering of rows
> within a frame.
>
> In ROWS mode, OTOH, the contents each frame themselves are not fully determined,
> so even an ordering agnostic window function may produce non-deterministic results.
> ----------
>
> If you think that something along these lines would be an improvement, I can try
> to come up with a patch.
>
> best regards,
> Florian Pflug
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2012-08-16 15:24:55 Re: HeapTupleHeaderAdvanceLatestRemovedXid doing the wrong thing with multixacts
Previous Message Heikki Linnakangas 2012-08-16 15:19:39 Re: pgsql: In docs, change a few cases of "not important" to "unimportant".