Re: My "variable number of bind variables for dynamic SQL" solution. Comments?

From: Michael Moore <michaeljmoore(at)gmail(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: postgres list <pgsql-sql(at)postgresql(dot)org>
Subject: Re: My "variable number of bind variables for dynamic SQL" solution. Comments?
Date: 2016-05-26 22:59:14
Message-ID: CACpWLjNTDjz8TBY9Zoff2D6+Et1gH1QC5+8whY0Tm40U-Bu0bA@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-sql

On Wed, May 25, 2016 at 12:43 PM, David G. Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com> wrote:

> On Wed, May 25, 2016 at 2:36 PM, Michael Moore <michaeljmoore(at)gmail(dot)com>
> wrote:
>
>> I'm a postgres noob, so I am looking for advice/ comments.
>> My example code here demonstrates the solution to a real world situation
>> where I am converting a PL/SQL Package (Oracle) to a pgPL/SQL function. The
>> original package constructs an elaborate SELECT statement which can look
>> extremely different depending on the the values of various parameters and
>> the results of table lookups. The only thing that does not vary are the
>> columns of the constructed SELECT statement.
>>
>>
> ​The usual solution to $subject is to either use something like " value =
> ANY(array) " or populate a temporary table and write your SQL to join
> against that temporary table.​ You incorporate ANY into your query but you
> are trying to populate it dynamically. The construction of the array
> should be from a serialized input:
>
> 2 = ANY ( string_to_array( '1,2,3' , ',' )::bigint[] )
>
> fbind will be called by the function shown immediately after this
>>
>>> CREATE OR REPLACE FUNCTION fbind(
>>>
>>> IN p_psudo_datatype_i character varying,
>>>
>>> IN p_parameter_position_i int)
>>>
>>> RETURNS text AS
>>>
>>> $BODY$
>>>
>>> BEGIN
>>>
>>>
>>>> CASE
>>>> ​​
>>>> p_psudo_datatype_i
>>>
>>> WHEN 'cvs num' THEN
>>>
>>> return '= ANY
>>>> ((''{''||$'||p_parameter_position_i::text||'||''}'')::bigint[] ) '::text;
>>>
>>> WHEN 'bigint' THEN
>>>
>>> return '=
>>>> TO_NUMBER($'||p_parameter_position_i::text||',''99999999999999999999'')
>>>> '::text;
>>>
>>> ELSE
>>>
>>> return 'datatype not implemented'::text;
>>>
>>> END CASE;
>>>
>>>
>>>> END;$BODY$
>>>
>>> LANGUAGE plpgsql VOLATILE
>>>
>>> COST 100;
>>>
>>>
>> test driver procedure
>>
>> DO $$
>>>
>>> DECLARE
>>>
>>> sql text := 'select vendor_name from tx_vendor WHERE active_flag = ''Y''
>>>> ';
>>>
>>> v_bind_values text[];
>>>
>>> v_vendor_key bigint := 1017;
>>>
>>> v_cvs_code2tcode character varying (30) := '2222,5235,7979';
>>>
>>>
> ​csv...​
>
> v_vendor_name text;
>>>
>>> BEGIN
>>>
>>> v_bind_values := array_append(v_bind_values, v_vendor_key::text);
>>>
>>> sql := sql || '
>>>
>>> and vendor_key'||fbind('bigint'::text,array_length(v_bind_values,1));
>>>
>>>
>>>> -- in a real use situation, the following 2 lines of code may or may
>>>> not be
>>>
>>> -- executed, meaning, we will not know how many bind variables
>>>
>>> v_bind_values := array_append(v_bind_values,v_cvs_code2tcode::text);
>>>
>>> sql := sql || '
>>>
>>> and code2tcode'||fbind('cvs num'::text,array_length(v_bind_values,1));
>>>
>>>
>>>> case array_length(v_bind_values,1)
>>>
>>> when 1 then
>>>
>>> execute sql into strict v_vendor_name using v_bind_values[1];
>>>
>>> when 2 then
>>>
>>> execute sql into strict v_vendor_name using v_bind_values[1],
>>>> v_bind_values[2];
>>>
>>> else
>>>
>>> raise exception 'undefined number of bind variables' ;
>>>
>>> end case;
>>>
>>>
>>>> raise notice '
>>>
>>> sql:%', sql ;
>>>
>>> raise notice '
>>>
>>> v_vendor_name:%',v_vendor_name;
>>>
>>>
>>>> END$$;
>>>
>>>
>>
>
> ​sql := $$
> SELECT vendor_name FROM tx_vendor WHERE active_flag = 'Y'
> AND vendor_key = %L
> AND code2tcode = ANY(string_to_array(%L, ',')::bigint[])
> $$​
>
> ​EXECUTE format(sql, ​a_vendor_key, a_code_csv_string);
>
> ​or, if "a_code_csv_string" can already be an array (a_code_array) the
> following should work.
> ​
> ​​sql := $$
> SELECT vendor_name FROM tx_vendor WHERE active_flag = 'Y'
> AND vendor_key = $1
> AND code2tcode = ANY($2)::bigint[])
> $$​
>
> ​EXECUTE sql USING (a_vendor_key, a_code_array);
>
> ​To avoid the conditional you can always test the a_code_array...and make
> sure to pass in a meaningful default instead of nothing
>
> ​​sql := $$
> SELECT vendor_name FROM tx_vendor WHERE active_flag = 'Y'
> AND vendor_key = $1
> AND ((code2tcode = ANY($2)::bigint[]) OR (COALESCE($2::bigint[],
> ARRAY[]::bigint[]) = ARRAY[]::bigint[]))
> $$​
> ​
>
>> RUN results ... shows the SQL statement that was constructed
>>
>>> NOTICE:
>>>
>>> sql:select vendor_name from tx_vendor WHERE active_flag = 'Y'
>>>
>>> and vendor_key= TO_NUMBER($1,'99999999999999999999')
>>>
>>> and code2tcode= ANY (('{'||$2||'}')::bigint[] )
>>>
>>> NOTICE:
>>>
>>> v_vendor_name:Irwin Union Bank
>>>
>>>
>>>> Query returned successfully with no result in 15 msec.
>>>
>>>
>> All comments or suggestions for improvement are welcome.
>>
>>
> ​I'm done for now - hopefully that helps. Your indirection through fbind
> doesn't help - It seems like a poorly named and coded function. It should
> be two better-named functions the choice of which to call depending upon
> which ​p_psudo_datatype_i you would have passed in.
>
> The main thing to point out is the use of dollar-quoting and the format()
> function. I'm not positive you even need dynamic sql here but if you do
> those two capabilities in PostgreSQL made using it much more pleasant.
>
> You should also avoid writing stuff like:
>
> '{' || ... || '}')::bigint[]
>
> Though for numbers it isn't too bad. I'd much prefer to simply default to
> "string_to_array" to something equivalent (e.g., regexp_split_to_array),
> and only if you cannot get a hold of an actual array and pass it in
> directly. Your fbind function should be able to handle that computation so
> that users don't have to deal with these contortions.
>
> David J.
>
>
Great food for though David, thanks!

I tried the $ quoting and was not able to get it to work because I am
trying to also concatenate in the number portion of the bind variable ...

bind_number = 1;
​​sql := $qq$
SELECT vendor_name FROM tx_vendor WHERE active_flag = 'Y'
AND vendor_key = *$$qq$*||bind_number::text||$qq$
AND code2tcode = 123
$qq$​

The compiler does not like $$qq$ above. I suppose I could do:
bind_number = 1;
bind_str := '$'||bind_number::text;
​​sql := $qq$
SELECT vendor_name FROM tx_vendor WHERE active_flag = 'Y'
AND vendor_key = *$qq$*||bind_str||$qq$
AND code2tcode = 123
$qq$​

You point about "two better-named functions" is well taken. I will do as
you suggest.

Next point. I have concerns about using EXECUTE FORMAT. Specifically about
the number of cursors that would be generated as a result. Would using a
literal cause more cursors than using a bind variable?

1) select vendor_name from tVendor where credit = 'CAT'; -- cursor 1
2) select vendor_name from tVendor where credit = 'DOG'; -- cursor 2
3) select vendor_name from tVendor where credit = $1; -- can be any value,
reuses the same cursor

Perhaps Postgres is smart enough to reuse the same cursor for 1 and 2
above? In the real world, my function will be called thousands of times per
minute, so performance is critical.

Next
In reference to the part where you said:

> ​The usual solution to $subject is to either use something like " value =
> ANY(array) " or populate a temporary table and write your SQL to join
> against that temporary table.​ You incorporate ANY into your query but you
> are trying to populate it dynamically. The construction of the array
> should be from a serialized input:

I'm confused about what you meant by "solution to $subject ".
As for the "temporary table" approach, I have a concern that joining
additional tables might create a less than optimal execution plans
especially given that there are no stats for these temporary tables. I have
no idea if this concern is valid.

Per my best interpretation of your advice tempered by my understanding of
the application, here is my new approach:
--- THE TESTER

>

DO $$

DECLARE

-- simulate input parameters

v_csv_text character varying (30) := ' 2222 , aaaa , 79 Q9 ';

v_csv_bigint character varying (30) := ' 2222 , 3333 , 4444 ';

sql text;

--

rslt text;

v_bind_values text[];

parm_num int;

BEGIN

v_bind_values := array_append(v_bind_values , v_csv_text::text);

parm_num := array_length(v_bind_values,1);

sql :=

$z$select 'true' where '79 Q9' $z$|| fbind_csv_text(parm_num) ;

> v_bind_values := array_append(v_bind_values , v_csv_bigint::text);

parm_num := array_length(v_bind_values,1);

sql := SQL ||

$z$ and 3333 $z$|| fbind_csv_number(parm_num);

> raise notice '

sql:%',sql;

> execute sql into rslt using v_bind_values[1], v_bind_values[2];

raise notice 'result:%',rslt;

end$$;

------------- output
NOTICE:
sql:select 'true' where '79 Q9' = ANY(csv_text( $1)) and 3333 =
ANY(csv_bigint( $2))
NOTICE: result:true

Query returned successfully with no result in 20 msec.

----------SUB FUNCTIONS

> CREATE OR REPLACE FUNCTION fbind_csv_text(p_parameter_position_i integer)
> RETURNS text AS
> $BODY$
> BEGIN
> return '= ANY(csv_text( $'||p_parameter_position_i::text||'))'::text;
> END;$BODY$
> LANGUAGE plpgsql VOLATILE

> CREATE OR REPLACE FUNCTION fbind_csv_bigint(p_parameter_position_i integer)

RETURNS text AS

$BODY$

BEGIN

return '= ANY(csv_bigint( $'||p_parameter_position_i::text||'))'::text;

END;$BODY$

LANGUAGE plpgsql VOLATILE

COST 100;

> CREATE OR REPLACE FUNCTION csv_text(dirty_string text)

RETURNS text[] AS

$BODY$

BEGIN

--does trim per application requirement and returns text[] for text based
> ANY function

return string_to_array( string_agg(clean_node,','),',' ) FROM (

SELECT trim(regexp_split_to_table(dirty_string, ',')) AS clean_node)
> alias;

END;$BODY$

LANGUAGE plpgsql VOLATILE

CREATE OR REPLACE FUNCTION csv_bigint(dirty_string text)

RETURNS bigint[] AS

$BODY$

BEGIN

-- does TRIM, though technically not needed for numeric values,

-- and returns bigint[] for number based ANY function

-- This function can also serve as an EXIT in case there

-- is any future need to massage the input csv string for example

-- convert "555,'$1,234.09',777" into "555,1234.09,777" for numeric
> comparison.

return string_to_array( string_agg(clean_node,','),',' ) FROM (

SELECT trim(regexp_split_to_table(dirty_string, ',')) AS clean_node)
> alias;

> END;$BODY$

LANGUAGE plpgsql VOLATILE

As usual, comments are appreciated though not expected.
Thanks,
Mike

In response to

Responses

Browse pgsql-sql by date

  From Date Subject
Next Message David G. Johnston 2016-05-26 23:54:40 Re: My "variable number of bind variables for dynamic SQL" solution. Comments?
Previous Message Michael Moore 2016-05-26 00:26:08 Re: quoted csv to array pattern