Re: PL/Python adding support for multi-dimensional arrays

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Alexey Grishchenko <agrishchenko(at)pivotal(dot)io>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Alexey Grishchenko <programmerag(at)gmail(dot)com>
Subject: Re: PL/Python adding support for multi-dimensional arrays
Date: 2016-08-10 05:53:13
Message-ID: CAFj8pRDiFLYYmE7CAqA_6k6VZOa7tfpH3AM4nhrHQvz-dtCMLA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

2016-08-03 13:54 GMT+02:00 Alexey Grishchenko <agrishchenko(at)pivotal(dot)io>:

> On Wed, Aug 3, 2016 at 12:49 PM, Alexey Grishchenko <
> agrishchenko(at)pivotal(dot)io> wrote:
>
>> Hi
>>
>> Current implementation of PL/Python does not allow the use of
>> multi-dimensional arrays, for both input and output parameters. This forces
>> end users to introduce workarounds like casting arrays to text before
>> passing them to the functions and parsing them after, which is an
>> error-prone approach
>>
>> This patch adds support for multi-dimensional arrays as both input and
>> output parameters for PL/Python functions. The number of dimensions
>> supported is limited by Postgres MAXDIM macrovariable, by default equal to
>> 6. Both input and output multi-dimensional arrays should have fixed
>> dimension sizes, i.e. 2-d arrays should represent MxN matrix, 3-d arrays
>> represent MxNxK cube, etc.
>>
>> This patch does not support multi-dimensional arrays of composite types,
>> as composite types in Python might be represented as iterators and there is
>> no obvious way to find out when the nested array stops and composite type
>> structure starts. For example, if we have a composite type of (int, text),
>> we can try to return "[ [ [1,'a'], [2,'b'] ], [ [3,'c'], [4,'d'] ] ]", and
>> it is hard to find out that the first two lists are lists, and the third
>> one represents structure. Things are getting even more complex when you
>> have arrays as members of composite type. This is why I think this
>> limitation is reasonable.
>>
>> Given the function:
>>
>> CREATE FUNCTION test_type_conversion_array_int4(x int4[]) RETURNS int4[]
>> AS $$
>> plpy.info(x, type(x))
>> return x
>> $$ LANGUAGE plpythonu;
>>
>> Before patch:
>>
>> # SELECT * FROM test_type_conversion_array_int4(ARRAY[[1,2,3],[4,5,6]]);
>> ERROR: cannot convert multidimensional array to Python list
>> DETAIL: PL/Python only supports one-dimensional arrays.
>> CONTEXT: PL/Python function "test_type_conversion_array_int4"
>>
>>
>> After patch:
>>
>> # SELECT * FROM test_type_conversion_array_int4(ARRAY[[1,2,3],[4,5,6]]);
>> INFO: ([[1, 2, 3], [4, 5, 6]], <type 'list'>)
>> test_type_conversion_array_int4
>> ---------------------------------
>> {{1,2,3},{4,5,6}}
>> (1 row)
>>
>>
>> --
>> Best regards,
>> Alexey Grishchenko
>>
>
> Also this patch incorporates the fix for https://www.postgresql.
> org/message-id/CAH38_tkwA5qgLV8zPN1OpPzhtkNKQb30n3x
> q-2NR9jUfv3qwHA%40mail.gmail.com, as they touch the same piece of code -
> array manipulation in PL/Python
>
>
I am sending review of this patch:

1. The implemented functionality is clearly benefit - passing MD arrays,
pretty faster passing bigger arrays
2. I was able to use this patch cleanly without any errors or warnings
3. There is no any error or warning
4. All tests passed - I tested Python 2.7 and Python 3.5
5. The code is well commented and clean
6. For this new functionality the documentation is not necessary

7. I invite more regress tests for both directions (Python <-> Postgres)
for more than two dimensions

My only one objection is not enough regress tests - after fixing this patch
will be ready for commiters.

Good work, Alexey

Thank you

Regards

Pavel

> --
> Best regards,
> Alexey Grishchenko
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2016-08-10 05:57:50 Re: [sqlsmith] Failed assertion in joinrels.c
Previous Message Michael Paquier 2016-08-10 05:24:48 Re: Small issues in syncrep.c