Re: Issues with generate_series using integer boundaries

From: Thom Brown <thom(at)linux(dot)com>
To: Alban Hertroys <dalroi(at)solfertje(dot)student(dot)utwente(dot)nl>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PGSQL Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Issues with generate_series using integer boundaries
Date: 2011-02-03 13:32:51
Message-ID: AANLkTim0YSoXtWgggrg9om+M-kg0ztFj2MwsSt0ahJ1A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On 3 February 2011 11:34, Thom Brown <thom(at)linux(dot)com> wrote:
> On 3 February 2011 11:31, Thom Brown <thom(at)linux(dot)com> wrote:
>> On 1 February 2011 23:08, Thom Brown <thom(at)linux(dot)com> wrote:
>>> On 1 February 2011 21:32, Alban Hertroys
>>> <dalroi(at)solfertje(dot)student(dot)utwente(dot)nl> wrote:
>>>> On 1 Feb 2011, at 21:26, Thom Brown wrote:
>>>>
>>>>> On 1 February 2011 01:05, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>>>> Thom Brown <thom(at)linux(dot)com> writes:
>>>>>>> I've noticed that if I try to use generate_series to include the upper
>>>>>>> boundary of int4, it never returns:
>>>>>>
>>>>>> I'll bet it's testing "currval > bound" without considering the
>>>>>> possibility that incrementing currval caused an overflow wraparound.
>>>>>> We fixed a similar problem years ago in plpgsql FOR-loops...
>>>>>
>>>>> Yes, you're right.  Internally, the current value is checked against
>>>>> the finish.  If it hasn't yet passed it, the current value is
>>>>> increased by the step.  When it reaches the upper bound, since it
>>>>> hasn't yet exceeded the finish, it proceeds to increment it again,
>>>>> resulting in the iterator wrapping past the upper bound to become the
>>>>> lower bound.  This then keeps it looping from the lower bound upward,
>>>>> so the current value stays well below the end.
>>>>
>>>>
>>>> That could actually be used as a feature to create a repeating series. A bit more control would be useful though :P
>>>
>>> I don't quite understand why the code works.  As I see it, it always
>>> returns a set with values 1 higher than the corresponding result.  So
>>> requesting 1 to 5 actually returns 2 to 6 internally, but somehow it
>>> correctly shows 1 to 5 in the query output.  If there were no such
>>> discrepancy, the upper-bound/lower-bound problem wouldn't exist, so
>>> not sure how those output values result in the correct query result
>>> values.
>>
>> Okay, I've attached a patch which fixes it.  It allows ranges up to
>> upper and down to lower bounds as well as accounting for the
>> possibility for the step to cause misalignment of the iterating value
>> with the end value.  The following now works which would usually get
>> stuck in a loop:
>>
>> postgres=# SELECT x FROM generate_series(2147483643::int4,
>> 2147483647::int4) AS a(x);
>>     x
>> ------------
>>  2147483643
>>  2147483644
>>  2147483645
>>  2147483646
>>  2147483647
>> (5 rows)
>>
>> postgres=# SELECT x FROM generate_series(2147483642::int4,
>> 2147483647::int4, 2) AS a(x);
>>     x
>> ------------
>>  2147483642
>>  2147483644
>>  2147483646
>> (3 rows)
>>
>> postgres=# SELECT x FROM generate_series(2147483643::int4,
>> 2147483647::int4, 6) AS a(x);
>>     x
>> ------------
>>  2147483643
>> (1 row)
>>
>>
>> It's probably safe to assume the changes in the patch aren't up to
>> scratch and it's supplied for demonstration purposes only, so could
>> someone please use the same principals and code in the appropriate
>> changes?
>>
>> Thanks
>>
>
> And I see I accidentally included a doc change in there.  Removed and
> reattached:

Actually, further testing indicates this causes other problems:

postgres=# SELECT x FROM generate_series(1, 9,-1) AS a(x);
x
---
1
(1 row)

Should return no rows.

postgres=# SELECT x FROM generate_series(1, 9,3) AS a(x);
x
----
1
4
7
10
(4 rows)

Should return 3 rows.

--
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Thom Brown 2011-02-03 13:58:04 Re: Issues with generate_series using integer boundaries
Previous Message Sim Zacks 2011-02-03 12:51:41 Re: Database Design Question

Browse pgsql-hackers by date

  From Date Subject
Next Message Kenneth Marshall 2011-02-03 13:41:42 Re: [HACKERS] Slow count(*) again...
Previous Message Magnus Hagander 2011-02-03 12:56:48 pgsql: Include more status information in walsender results