Re: PL/Python adding support for multi-dimensional arrays

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Dave Cramer <pg(at)fastcrypt(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexey Grishchenko <programmerag(at)gmail(dot)com>
Subject: Re: PL/Python adding support for multi-dimensional arrays
Date: 2016-09-22 07:28:21
Message-ID: CAFj8pRD7ceicC0p68TzqyJGM541-M4V5yOqOA_68YWK3hDq+bg@mail.gmail.com
Lists: pgsql-hackers

Hi

2016-09-21 19:53 GMT+02:00 Dave Cramer <pg(at)fastcrypt(dot)com>:

>
> On 18 September 2016 at 09:27, Dave Cramer <pg(at)fastcrypt(dot)com> wrote:
>
>>
>> On 10 August 2016 at 01:53, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
>> wrote:
>>
>>> Hi
>>>
>>> 2016-08-03 13:54 GMT+02:00 Alexey Grishchenko <agrishchenko(at)pivotal(dot)io>:
>>>
>>>> On Wed, Aug 3, 2016 at 12:49 PM, Alexey Grishchenko <
>>>> agrishchenko(at)pivotal(dot)io> wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> Current implementation of PL/Python does not allow the use of
>>>>> multi-dimensional arrays, for both input and output parameters. This forces
>>>>> end users to introduce workarounds like casting arrays to text before
>>>>> passing them to the functions and parsing them after, which is an
>>>>> error-prone approach.
>>>>>
>>>>> This patch adds support for multi-dimensional arrays as both input and
>>>>> output parameters for PL/Python functions. The number of dimensions
>>>>> supported is limited by the Postgres MAXDIM macro, by default equal to
>>>>> 6. Both input and output multi-dimensional arrays should have fixed
>>>>> dimension sizes, i.e. 2-d arrays should represent MxN matrix, 3-d arrays
>>>>> represent MxNxK cube, etc.
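
The fixed-dimension-size rule quoted above can be illustrated with a small pure-Python sketch. This is not the patch's C code; the helper name list_dims is hypothetical, and MAXDIM is hard-coded here to the default of 6 mentioned in the quoted text. It only shows what "MxN matrix" means for a nested list: every sub-list at a given depth must have the same length.

```python
MAXDIM = 6  # default dimension limit quoted in the message above


def list_dims(value, max_dims=MAXDIM):
    """Return the dimension sizes of a nested list, e.g. [2, 3] for a 2x3 matrix.

    Hypothetical helper for illustration only. Raises ValueError if the
    nesting is ragged, mixes lists with scalars, or exceeds max_dims.
    """
    dims = []
    level = [value]  # all objects found at the current nesting depth
    while level and all(isinstance(item, list) for item in level):
        if len(dims) == max_dims:
            raise ValueError("more than %d dimensions" % max_dims)
        sizes = {len(item) for item in level}
        if len(sizes) != 1:
            raise ValueError("sub-lists at depth %d differ in length" % len(dims))
        dims.append(sizes.pop())
        # Descend one level: collect the children of every list at this depth.
        level = [child for item in level for child in item]
    if any(isinstance(item, list) for item in level):
        raise ValueError("lists and scalars mixed at depth %d" % len(dims))
    return dims


matrix_dims = list_dims([[1, 2, 3], [4, 5, 6]])
```

With this sketch, [[1, 2, 3], [4, 5, 6]] is accepted as a 2x3 matrix, while a ragged list such as [[1, 2], [3]] is rejected.
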
>>>>>
>>>>> This patch does not support multi-dimensional arrays of composite
>>>>> types, as composite types in Python might be represented as iterators and
>>>>> there is no obvious way to find out when the nested array stops and
>>>>> composite type structure starts. For example, if we have a composite type
>>>>> of (int, text), we can try to return "[ [ [1,'a'], [2,'b'] ], [ [3,'c'],
>>>>> [4,'d'] ] ]", and it is hard to find out that the first two lists are
>>>>> lists, and the third one represents structure. Things are getting even more
>>>>> complex when you have arrays as members of composite type. This is why I
>>>>> think this limitation is reasonable.
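
The ambiguity described above can be made concrete with a short Python sketch. The helpers nesting_depth and as_composite_rows are hypothetical names used only for illustration, and the (int, text) type is the one from the quoted example. Looking at the value alone gives three levels of nesting; reading it as a 2-d array of composites only works if the caller already knows how many levels belong to the array.

```python
data = [[[1, 'a'], [2, 'b']], [[3, 'c'], [4, 'd']]]


def nesting_depth(value):
    """Count list levels by following the first element (hypothetical helper)."""
    depth = 0
    while isinstance(value, list) and value:
        depth += 1
        value = value[0]
    return depth


def as_composite_rows(value, array_dims):
    """Treat the first array_dims levels as array dimensions and each
    remaining list as one (int, text) row (hypothetical helper)."""
    if array_dims == 0:
        number, label = value
        return (int(number), str(label))
    return [as_composite_rows(item, array_dims - 1) for item in value]


depth = nesting_depth(data)          # the value itself looks 3-dimensional
rows = as_composite_rows(data, 2)    # but it can also be read as 2x2 rows
```

Both readings are consistent with the same Python object, which is why the number of array dimensions cannot be inferred from the data alone.
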
>>>>>
>>>>> Given the function:
>>>>>
>>>>> CREATE FUNCTION test_type_conversion_array_int4(x int4[]) RETURNS
>>>>> int4[] AS $$
>>>>> plpy.info(x, type(x))
>>>>> return x
>>>>> $$ LANGUAGE plpythonu;
>>>>>
>>>>> Before patch:
>>>>>
>>>>> # SELECT * FROM test_type_conversion_array_int4(ARRAY[[1,2,3],[4,5,6]]);
>>>>> ERROR: cannot convert multidimensional array to Python list
>>>>> DETAIL: PL/Python only supports one-dimensional arrays.
>>>>> CONTEXT: PL/Python function "test_type_conversion_array_int4"
>>>>>
>>>>>
>>>>> After patch:
>>>>>
>>>>> # SELECT * FROM test_type_conversion_array_int4(ARRAY[[1,2,3],[4,5,6]]);
>>>>> INFO: ([[1, 2, 3], [4, 5, 6]], <type 'list'>)
>>>>> test_type_conversion_array_int4
>>>>> ---------------------------------
>>>>> {{1,2,3},{4,5,6}}
>>>>> (1 row)
>>>>>
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Alexey Grishchenko
>>>>>
>>>>
>>>> Also this patch incorporates the fix for
>>>> https://www.postgresql.org/message-id/CAH38_tkwA5qgLV8zPN1OpPzhtkNKQb30n3xq-2NR9jUfv3qwHA%40mail.gmail.com,
>>>> as they touch the same piece of code - array manipulation in PL/Python.
>>>>
>>>>
>>> I am sending review of this patch:
>>>
>>> 1. The implemented functionality is clearly a benefit - passing MD arrays,
>>> and faster passing of bigger arrays
>>> 2. I was able to use this patch cleanly without any errors or warnings
>>> 3. There are no errors or warnings
>>> 4. All tests passed - I tested Python 2.7 and Python 3.5
>>> 5. The code is well commented and clean
>>> 6. For this new functionality the documentation is not necessary
>>>
>>> 7. I would welcome more regression tests for both directions
>>> (Python <-> Postgres) for more than two dimensions
>>>
>>> My only objection is that there are not enough regression tests - after
>>> fixing this, the patch will be ready for committers.
>>>
>>
The tests are now sufficient, so I'll mark this patch as ready for committers.

I had to fix the tests - there was a lot of whitespace, and the result for
python3 was missing.

Regards

Pavel

>
>>> Good work, Alexey
>>>
>>> Thank you
>>>
>>> Regards
>>>
>>> Pavel
>>>
>>>
>>>> --
>>>> Best regards,
>>>> Alexey Grishchenko
>>>>
>>>
>>>
>>
>> Pavel,
>>
>> I will pick this up.
>>
>>
>> Pavel,
>
> Please see attached patch which provides more test cases
>
> I just realized this patch contains the original patch as well. What is
> the protocol for sending in subsequent patches ?
>
>>
>> Dave Cramer
>>
>> davec(at)postgresintl(dot)com
>> www.postgresintl.com
>>
>>
>

Attachment Content-Type Size
PL-Python-adding-support-for-multi-dimensional-arrays-20160922.patch text/x-patch 46.0 KB
