[PROPOSAL]a new data type 'bytea' for ECPG

From: "Matsumura, Ryo" <matsumura(dot)ryo(at)jp(dot)fujitsu(dot)com>
To: "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: [PROPOSAL]a new data type 'bytea' for ECPG
Date: 2018-10-01 08:03:42
Message-ID: 03040DFF97E6E54E88D3BFEE5F5480F737A141F9@G01JPEXMBYT04
Lists: pgsql-hackers

Hi, Hackers

# This is my first post.

I would like to implement a new data type 'bytea' for ECPG.
I don't think the implementation is complicated.
Would anyone find it useful?

* Why do I need bytea?

Currently, an ECPG program can handle binary data for a bytea column with the
'char' type of the C language, but it must convert from/to the escaped format
with PQunescapeBytea()/PQescapeBytea(). This forces users to write extra code
and to pay for the conversion at runtime.
# My PoC will not be able to eliminate the output conversion cost.
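
To make the input-side cost concrete, here is a minimal sketch of the kind of text conversion a client has to perform today. It is a hand-written stand-in, not libpq's code: `bytea_to_hex_text` is a hypothetical helper that encodes raw bytes in bytea's hex text format ("\x" followed by two hex digits per byte).

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper, not part of libpq: encode len raw bytes in bytea
 * hex text format ("\x" followed by two hex digits per byte).
 * Returns a malloc'ed NUL-terminated string, or NULL on allocation failure.
 */
static char *
bytea_to_hex_text(const unsigned char *data, size_t len)
{
    static const char digits[] = "0123456789abcdef";
    char       *out = malloc(2 + 2 * len + 1);
    size_t      i;

    if (out == NULL)
        return NULL;

    out[0] = '\\';
    out[1] = 'x';
    for (i = 0; i < len; i++)
    {
        out[2 + 2 * i] = digits[data[i] >> 4];        /* high nibble */
        out[2 + 2 * i + 1] = digits[data[i] & 0x0F];  /* low nibble */
    }
    out[2 + 2 * len] = '\0';
    return out;
}
```

Every input byte costs an extra allocation slot, two output characters, and a pass over the data, which is the overhead a binary-format parameter would avoid.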

I think that setting and reading data through a host variable should be simpler.
The following is an example of an Oracle Pro*C program for a RAW type column.

VARCHAR raw_data[20];

/* preprocessed to the following
 * struct
 * {
 *     unsigned short len;
 *     unsigned char  arr[20];
 * } raw_data;
 */

raw_data.len = 10;
memcpy(raw_data.arr, data, 10);

see also:
https://docs.oracle.com/cd/E11882_01/appdev.112/e10825/pc_04dat.htm#i23305

In ECPG, a varchar host variable cannot be used for bytea because it cannot treat
'\0' as part of the data. If the length is set to 10 and the 3rd byte is '\0',
ecpglib drops the 3rd byte and everything after it at the following point:

[src/interfaces/ecpg/ecpglib/execute.c]
ecpg_store_input(const int lineno, const bool force_indicator, const struct
:
switch (var->type)
:
case ECPGt_varchar:
if (!(newcopy = (char *) ecpg_alloc(variable->len + 1, lineno)))
return false;
!! strncpy(newcopy, variable->arr, variable->len);
newcopy[variable->len] = '\0';
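
The truncation can be reproduced outside ecpglib. The sketch below is not ecpglib code; `copy_like_varchar` and `copy_as_binary` are hypothetical helpers that contrast the strncpy() pattern quoted above with a plain memcpy().

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes the way the quoted varchar branch does. strncpy() stops
 * copying at the first '\0' in src and zero-fills the rest of dst, so any
 * bytes after an embedded '\0' are lost. dst must hold len + 1 bytes.
 */
static void
copy_like_varchar(char *dst, const char *src, size_t len)
{
    strncpy(dst, src, len);
    dst[len] = '\0';
}

/* Copy len bytes verbatim, which is what binary data needs. */
static void
copy_as_binary(char *dst, const char *src, size_t len)
{
    memcpy(dst, src, len);
}
```

With a 10-byte input whose 3rd byte is '\0', the first helper keeps only the first two bytes, while the second keeps all ten.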

I also think that the behavior of varchar host variables should not be changed,
for compatibility reasons.
Therefore, a new host variable type 'bytea' is needed.

Since ecpglib can then distinguish between C strings and binary data, it can send
binary data to the backend directly by using the 'paramFormats' argument of
PQexecParams(). Unfortunately, the conversion of output data cannot be omitted in
ecpglib because libpq doesn't provide a per-column counterpart of 'paramFormats'
for results.
('resultFormat' specifies whether *all* data from the backend is returned in binary format or not.)

PQexecParams(PGconn *conn,
const char *command,
int nParams,
const Oid *paramTypes,
const char *const *paramValues,
const int *paramLengths,
!! const int *paramFormats,
int resultFormat)
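
As a rough illustration of the input side, the sketch below fills one slot of the arrays that PQexecParams() takes. It deliberately avoids including libpq so that it compiles on its own; `struct bytea_var` and `bind_bytea_param` are hypothetical names, and the 64-byte buffer is an arbitrary choice. The only libpq facts relied on are that format code 1 means binary, 0 means text, and that binary parameters need an explicit length.

```c
#include <stddef.h>

/* Hypothetical stand-in for a preprocessed bytea host variable. */
struct bytea_var
{
    int         len;
    char        arr[64];
};

/*
 * Fill slot i of the arrays that would later be passed to PQexecParams().
 * The data is referenced in place, with no escaping and no copy.
 */
static void
bind_bytea_param(const struct bytea_var *var, int i,
                 const char **paramValues,
                 int *paramLengths,
                 int *paramFormats)
{
    paramValues[i] = var->arr;   /* raw bytes, may contain '\0' */
    paramLengths[i] = var->len;  /* required for binary parameters */
    paramFormats[i] = 1;         /* 1 = binary, 0 = text */
}
```

Other parameters of the same statement can keep format code 0, which is why a per-parameter format array is enough for input.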

* How to use the new 'bytea'?

ECPG programmers can use it in almost the same way as 'varchar', except that it
cannot be used for names (e.g. connection name, prepared statement name, cursor
name and so on).

- It can be used in a declare section.

exec sql begin declare section;
bytea data1[512];
bytea data2[DATA_SIZE]; /* can use macro */
bytea send_data[DATA_NUM][DATA_SIZE]; /* can use two dimensional array */
bytea recv_data[][DATA_SIZE]; /* can use flexible array */
exec sql end declare section;

- It can *not* be used as a name.

exec sql begin declare section;
bytea conn_name[DATA_SIZE];
exec sql end declare section;

exec sql connect to :conn_name; !! error

- No conversion is needed in the user program.

exec sql begin declare section;
bytea send_buf[DATA_SIZE];
bytea recv_buf[DATA_SIZE - 13];
int ind_recv;
exec sql end declare section;

exec sql create table test (data1 bytea);
exec sql truncate test;
exec sql insert into test (data1) values (:send_buf);
exec sql select data1 into :recv_buf:ind_recv from test;
/* ind_recv is set to 13. */

* How is 'bytea' preprocessed?

'bytea' is preprocessed in almost the same way as varchar.
The declarations below are preprocessed into the structs that follow them.

exec sql begin declare section;
bytea data[DATA_SIZE];
bytea send_data[DATA_NUM][DATA_SIZE];
bytea recv_data[][DATA_SIZE];
exec sql end declare section;

struct bytea_1 {int len; char arr[DATA_SIZE];} data;
struct bytea_2 {int len; char arr[DATA_SIZE];} send_data[DATA_NUM];
struct bytea_3 {int len; char arr[DATA_SIZE];} *recv_data;
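
For illustration, user code could then fill such a struct the same way as in the Pro*C example earlier. This is only a sketch of the proposed layout in plain C: the `DATA_SIZE` value of 16 and the helper `set_bytea` are assumptions made for the example, not part of the proposal.

```c
#include <string.h>

#define DATA_SIZE 16            /* assumed value, for illustration only */

/* Same shape as the proposed preprocessor output for: bytea data[DATA_SIZE]; */
struct bytea_1 {int len; char arr[DATA_SIZE];} data;

/*
 * Hypothetical helper: load n raw bytes into the host variable.
 * Returns 0 on success, or -1 if the bytes do not fit in arr.
 */
static int
set_bytea(const void *src, size_t n)
{
    if (n > sizeof(data.arr))
        return -1;
    memcpy(data.arr, src, n);   /* embedded '\0' bytes are preserved */
    data.len = (int) n;
    return 0;
}
```

Because the length travels in `len`, no terminator is needed and the buffer can hold any byte value.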

Thank you for your consideration.

Regards
Ryo Matsumura
