Re: Issues with generate_series using integer boundaries

From: Thom Brown <thom(at)linux(dot)com>
To: Alban Hertroys <dalroi(at)solfertje(dot)student(dot)utwente(dot)nl>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PGSQL Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Issues with generate_series using integer boundaries
Date: 2011-02-03 11:31:01
Message-ID: AANLkTim-BBN5LgJNgwWFcOrDsm8uKhzP2KR1SQdCqfpF@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On 1 February 2011 23:08, Thom Brown <thom(at)linux(dot)com> wrote:
> On 1 February 2011 21:32, Alban Hertroys
> <dalroi(at)solfertje(dot)student(dot)utwente(dot)nl> wrote:
>> On 1 Feb 2011, at 21:26, Thom Brown wrote:
>>
>>> On 1 February 2011 01:05, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> Thom Brown <thom(at)linux(dot)com> writes:
>>>>> I've noticed that if I try to use generate_series to include the upper
>>>>> boundary of int4, it never returns:
>>>>
>>>> I'll bet it's testing "currval > bound" without considering the
>>>> possibility that incrementing currval caused an overflow wraparound.
>>>> We fixed a similar problem years ago in plpgsql FOR-loops...
>>>
>>> Yes, you're right.  Internally, the current value is checked against
>>> the finish.  If it hasn't yet passed it, the current value is
>>> increased by the step.  When it reaches the upper bound, since it
>>> hasn't yet exceeded the finish, it proceeds to increment it again,
>>> resulting in the iterator wrapping past the upper bound to become the
>>> lower bound.  This then keeps it looping from the lower bound upward,
>>> so the current value stays well below the end.
>>
>>
>> That could actually be used as a feature to create a repeating series. A bit more control would be useful though :P
>
> I don't quite understand why the code works.  As I see it, it always
> returns a set with values 1 higher than the corresponding result.  So
> requesting 1 to 5 actually returns 2 to 6 internally, but somehow it
> correctly shows 1 to 5 in the query output.  If there were no such
> discrepancy, the upper-bound/lower-bound problem wouldn't exist, so
> not sure how those output values result in the correct query result
> values.

Okay, I've attached a patch which fixes it. It allows ranges up to
upper and down to lower bounds as well as accounting for the
possibility for the step to cause misalignment of the iterating value
with the end value. The following now works which would usually get
stuck in a loop:

postgres=# SELECT x FROM generate_series(2147483643::int4,
2147483647::int4) AS a(x);
x
------------
2147483643
2147483644
2147483645
2147483646
2147483647
(5 rows)

postgres=# SELECT x FROM generate_series(2147483642::int4,
2147483647::int4, 2) AS a(x);
x
------------
2147483642
2147483644
2147483646
(3 rows)

postgres=# SELECT x FROM generate_series(2147483643::int4,
2147483647::int4, 6) AS a(x);
x
------------
2147483643
(1 row)

It's probably safe to assume the changes in the patch aren't up to
scratch and it's supplied for demonstration purposes only, so could
someone please use the same principals and code in the appropriate
changes?

Thanks

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

Attachment Content-Type Size
generate_series_fix.patch application/octet-stream 2.2 KB

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Thom Brown 2011-02-03 11:34:45 Re: Issues with generate_series using integer boundaries
Previous Message hubert depesz lubaczewski 2011-02-03 11:23:19 Re: Select Command in Procedures

Browse pgsql-hackers by date

  From Date Subject
Next Message Thom Brown 2011-02-03 11:34:45 Re: Issues with generate_series using integer boundaries
Previous Message Magnus Hagander 2011-02-03 10:27:16 Re: Typo in create user mapping docs page