From: | Michel Pelletier <pelletier(dot)michel(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Using Expanded Objects other than Arrays from plpgsql |
Date: | 2024-10-24 01:39:03 |
Message-ID: | CACxu=vJMqLUZR1N-AT4cXw9JHMsN=5-iYvPvKWy-CcKwz_9aBg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-hackers |
On Wed, Oct 23, 2024 at 8:21 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Michel Pelletier <pelletier(dot)michel(at)gmail(dot)com> writes:
> > Here's another example:
>
> > CREATE OR REPLACE FUNCTION test2(graph matrix)
> > RETURNS bigint LANGUAGE plpgsql AS
> > $$
> > BEGIN
> > perform set_element(graph, 1, 1, 1);
> > RETURN nvals(graph);
> > end;
> > $$;
> > CREATE FUNCTION
> > postgres=# select test2(matrix('int32'));
> > DEBUG: new_matrix
> > DEBUG: matrix_get_flat_size
> > DEBUG: flatten_matrix
> > DEBUG: scalar_int32
> > DEBUG: new_scalar
> > DEBUG: matrix_set_element
> > DEBUG: DatumGetMatrix
> > DEBUG: expand_matrix
> > DEBUG: new_matrix
> > DEBUG: DatumGetScalar
> > DEBUG: matrix_get_flat_size
> > DEBUG: matrix_get_flat_size
> > DEBUG: flatten_matrix
> > DEBUG: context_callback_matrix_free
> > DEBUG: context_callback_scalar_free
> > DEBUG: matrix_nvals
> > DEBUG: DatumGetMatrix
> > DEBUG: expand_matrix
> > DEBUG: new_matrix
> > DEBUG: context_callback_matrix_free
> > DEBUG: context_callback_matrix_free
> > test2
> > -------
> > 0
> > (1 row)
>
> I'm a little confused by your debug output. What are "scalar_int32"
> and "new_scalar", and what part of the plpgsql function is causing
> them to be invoked?
>
GraphBLAS scalars hold a single element value for the matrix type.
Internally, they are simply 1x1 matrices (much like vectors are 1xn
matrices). The function signature is:
set_element(a matrix, i bigint, j bigint, s scalar)
There is a "CAST (integer as scalar)" function (scalar_int32) that casts
Postgres integers to GraphBLAS GrB_INT32 scalar elements (which calls
new_scalar because like vectors and matrices, they are expanded objects
which have a GrB_Scalar "handle"). Scalars are useful for working with
individual values, for example reduce() returns a scalar. There are way
more efficient ways to push huge C arrays of values into matrices but for
now I'm just working at the element level.
Another thing that confuses me is why there's a second flatten_matrix
> operation happening here. Shouldn't set_element return its result
> as a R/W expanded object?
>
That confuses me too, and my default assumption is always that I'm doing it
wrong. set_element does return a R/W object afaict, here is the return:
https://github.com/OneSparse/OneSparse/blob/main/src/matrix.c#L1726
where:
#define OS_RETURN_MATRIX(_matrix) return EOHPGetRWDatum(&(_matrix)->hdr)
> > I would expect that to return 1. If I do "graph = set_element(graph, 1,
> 1,
> > 1)" it works.
>
> I think you have a faulty understanding of PERFORM. It's defined as
> "evaluate this expression and throw away the result", so it's *not*
> going to change "graph", not even if set_element declares that
> argument as INOUT.
Faulty indeed, I was going from the plpgsql statement documentation:
"Sometimes it is useful to evaluate an expression or SELECT query but
discard the result, for example when calling a function that has
side-effects but no useful result value."
My understanding of "side-effects" was flawed there, but I'm fine with "x =
set_element(x...)" anyway as I was trying to follow the example of
array_append et al.
> (Our interpretation of OUT arguments for functions
> is that they're just an alternate notation for specifying the function
> result.) If you want to avoid the explicit assignment back to "graph"
> then the thing to do would be to declare set_element as a procedure,
> not a function, with an INOUT argument and then call it with CALL.
>
I'll stick with the assignment.
That's only cosmetically different though, in that the updated
> "graph" value is still passed back much as if it were a function
> result, and then the CALL infrastructure knows it has to assign that
> back to the argument variable. And, as I tried to explain earlier,
> that code path currently has no mechanism for avoiding making a copy
> of the graph somewhere along the line: it will pass "graph" to the
> procedure as either a flat Datum or a R/O expanded object, so that
> set_element will be required to copy that before modifying it.
>
Right, I'm still figuring out exactly what that code flow is. This is my
first dive into these corners of the code so thank you for being patient
with me. I promise to write up some expanded object documentation if this
works!
> We can imagine extending whatever we do for "x := f(x)" cases so that
> it also works during "CALL p(x)". But I think that's only going to
> yield cosmetic or notational improvements so I don't want to start
> with doing that. Let's focus first on improving the existing
> infrastructure for the f(x) case.
>
+1
-Michel
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2024-10-24 02:10:31 | Re: Using Expanded Objects other than Arrays from plpgsql |
Previous Message | Andy Hartman | 2024-10-23 23:59:26 | Re: Backup |
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2024-10-24 01:57:14 | Re: Commutation of array SOME/ANY and ALL operators |
Previous Message | Matthew Morrissette Vance | 2024-10-24 01:27:59 | Commutation of array SOME/ANY and ALL operators |