Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array
Date: 2023-04-29 13:00:00
Message-ID: 5ebae5e4-d401-fadf-8585-ac3eaf53219c@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

28.04.2023 22:17, Tom Lane wrote:
> The real problem here is that we don't check that the list nesting
> depth is the same throughout the array: if lists are more deeply
> nested in later elements, we'll treat those sub-lists as scalars,
> leading to inconsistent results. Conversely, a less-deeply-nested
> list structure in a later element might still work, if it can be
> treated as a sequence. I think the second and third examples
> I gave should both throw errors.
>
> I also notice that the error messages in this area speak of "sequences",
> but it is more correct to call them "lists", because Python draws a
> distinction. (All lists are sequences, but not vice versa, eg a
> string is a sequence but not a list.)
>
> So I'm thinking about the attached.

Thanks for fixing this!
Now python's handling of arrays is much nicer and is aligned with the plperl's behaviour:
CREATE FUNCTION test_pl() RETURNS text[] AS $$ return [1, [2, 3]]; $$ LANGUAGE plperl;
SELECT * FROM test_pl();
ERROR:  multidimensional arrays must have array expressions with matching dimensions
CONTEXT:  PL/Perl function "test_pl"

I observed another light-hearted case (without the patch, of course):
CREATE FUNCTION test_py() RETURNS text[] AS $$ return [1, [2, 3]]; $$ LANGUAGE plpython3u;
SELECT * FROM test_py();
 {1,"[2, 3]"}

So the patch looks more like a bug fix.

Though I still see some discrepancy between plperl and plpython:
CREATE OR REPLACE FUNCTION test_pl() RETURNS text[] AS $$ return [[],1]; $$ LANGUAGE plperl;
SELECT * FROM test_pl();
ERROR:  multidimensional arrays must have array expressions with matching dimensions
CONTEXT:  PL/Perl function "test_pl"
vs
CREATE OR REPLACE FUNCTION test_py() RETURNS text[] AS $$ return [[],1]; $$ LANGUAGE plpython3u;
SELECT * FROM test_py();
 {[],1}

It seems that [] was recognized as "[]" here.

While playing with plperl, I found that it handles arrays not perfectly too:
CREATE OR REPLACE FUNCTION test_pl() RETURNS text[] AS $$return [1, []];$$ LANGUAGE plperl;
SELECT * FROM test_pl();
triggers:
==00:00:08:45.537 2325687== Conditional jump or move depends on uninitialised value(s)
==00:00:08:45.537 2325687==    at 0x61FEDC: construct_md_array (arrayfuncs.c:3500)
==00:00:08:45.537 2325687==    by 0x625917: makeMdArrayResult (arrayfuncs.c:5428)
==00:00:08:45.537 2325687==    by 0x486DEF3: plperl_array_to_datum (plperl.c:1278)
==00:00:08:45.537 2325687==    by 0x486D8E8: plperl_sv_to_datum (plperl.c:1347)
==00:00:08:45.537 2325687==    by 0x4872FA6: plperl_func_handler (plperl.c:2483)
==00:00:08:45.537 2325687==    by 0x4873CE5: plperl_call_handler (plperl.c:1858)
==00:00:08:45.537 2325687==    by 0x41406F: ExecMakeTableFunctionResult (execSRF.c:235)
==00:00:08:45.538 2325687==    by 0x426F1C: FunctionNext (nodeFunctionscan.c:95)
==00:00:08:45.538 2325687==    by 0x414AA7: ExecScanFetch (execScan.c:133)
==00:00:08:45.538 2325687==    by 0x414B42: ExecScan (execScan.c:182)
==00:00:08:45.538 2325687==    by 0x426E2E: ExecFunctionScan (nodeFunctionscan.c:270)
==00:00:08:45.538 2325687==    by 0x4117F1: ExecProcNodeFirst (execProcnode.c:464)
==00:00:08:45.538 2325687==

Here, nelems = 2 (from ArrayGetNItems(ndims, dims)), but
array_to_datum_internal() generated only one datum.

And yet another case:
CREATE OR REPLACE FUNCTION test_py() RETURNS text[] AS $$return [[1],[[]]];$$ LANGUAGE plpython3u;
regression=# SELECT * FROM test_py();
 {{1},{[]}}
vs
CREATE OR REPLACE FUNCTION test_pl() RETURNS text[] AS $$return [[1],[[]]];$$ LANGUAGE plperl;
SELECT * FROM test_pl();
 {}

> I do not propose this for
> back-patching, because it could break applications that work today.
> But it seems like good tightening-up for HEAD, or maybe we should
> wait for v17 at this point?

I suppose that waiting for v17 is preferable if the patch is considered as
bringing a new feature (it's not the case) or could require extra time to
stabilize. But I'm afraid that anomalies, that would require additional
fixes for the change stabilization, could be related to the existing
code, and thus that extra time will be invested in improving v16- too.

Best regards,
Alexander

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tom Lane 2023-04-29 17:16:30 Re: BUG #17912: Invalid memory access when converting plpython' array containing empty array
Previous Message Etsuro Fujita 2023-04-29 11:42:02 Re: BUG #17889: Invalid cursor direction for a foreign scan that reached the fetch_size (MOVE BACKWARD ALL IN cX)