RE: [PROPOSAL]a new data type 'bytea' for ECPG

From: "Matsumura, Ryo" <matsumura(dot)ryo(at)jp(dot)fujitsu(dot)com>
To: 'Michael Meskes' <meskes(at)postgresql(dot)org>
Cc: "Tsunakawa, Takayuki" <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: [PROPOSAL]a new data type 'bytea' for ECPG
Date: 2019-02-13 09:58:33
Message-ID: 03040DFF97E6E54E88D3BFEE5F5480F737AA66EC@G01JPEXMBYT04
Lists: pgsql-hackers

Meskes-san

First, I found a mistake in my patch: the following member is not used.
Sorry...

[ecpglib_extern.h]
120 struct descriptor_item
130 int data_len;

> Why is handling a bytea so different from handling a varchar?

Current architecture:
The internal representation of varchar is a C-string, which carries its length implicitly
because the length can be computed by strlen().
ECPGdo(ecpg_build_params) assumes that the data in a descriptor is C-string encoded.

On the other hand, bytea data is binary and does not carry any length information.
The merit of my proposal is that bytea data can be sent to the backend without
C-string encoding overhead. These are the points that differ from varchar.
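
# As a minimal illustration of this difference (the struct and function names below are
# hypothetical and not taken from ecpglib), a zero byte inside binary data makes strlen()
# stop early, so only an explicit length field describes the data correctly:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the struct generated for a bytea host variable. */
struct bytea_sketch
{
    int  len;       /* explicit number of valid bytes in arr */
    char arr[8];    /* raw bytes; may legitimately contain 0x00 */
};

/* Length as a C-string consumer sees it: stops at the first zero byte. */
size_t length_as_cstring(const char *data)
{
    return strlen(data);
}

/* Length as a binary consumer must see it: taken from the explicit field. */
size_t length_as_binary(const struct bytea_sketch *b)
{
    return (size_t) b->len;
}
```

# For the four bytes { 'a', 0x00, 'b', 'c' }, the C-string view reports a shorter length
# than the explicit len field, which is why the length must travel with the data.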

I will try to explain the current data flow and my ideas.
# It may not be a simple explanation...

Current data flow:

> /* exec sql set descriptor idesc value 1 data = :binary_var; */
> { ECPGset_desc(__LINE__, "idesc", 1,ECPGd_data,
> ECPGt_bytea,&(binary_var),(long)DATA_SIZE,(long)1,sizeof(struct bytea_1), ECPGd_EODT);

Ecpglib stores user data into [struct descriptor_item].
Ecpglib stores only C-string encoded data in the 'data' member, using ecpg_store_input.
Of course, if the user specifies a length(*), ecpg_store_input calls strncpy() with that length.
(*) the len member of struct varchar_1 { int len; char arr[ DATA_SIZE ]; }

# descriptor_item has 'type' and 'length' members, but the above statement doesn't set
# these fields because I think they should be set by the user explicitly, as follows:
# exec sql set descriptor idesc value 1 length = 3;
# I explain later that this user statement is in effect ignored.

> /* exec sql execute ins_stmt using sql descriptor idesc; */
> { ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_execute, "ins_stmt",
> ECPGt_descriptor, "idesc", 1L, 1L, 1L,
> ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_EOIT, ECPGt_EORT);

ecpg_build_params, the first step of ECPGdo, only does strcpy() from descriptor_item.data to
tobeinserted via ecpg_store_input, because the input [struct variable] for ecpg_store_input
is always set to type='ECPGt_char'. descriptor_item.type and descriptor_item.length are
never used.

# The varcharsize member is set to strlen(descriptor_item.data), but it is ignored
# by ecpg_store_input.

In that flow, how can user binary data be set to tobeinserted without C-string encoding?
The premises are the following:
- The length information set by the user must be conveyed up to ecpg_build_params.
- The only medium for the length information from ECPGset_desc to ECPGdo is [struct descriptor_item].

My Idea-1 in the previous mail is:
- ECPGset_desc copies the whole struct(*) to descriptor_item.data and sets the type
information in descriptor_item.is_binary.
(*)bytea_a { int len; char arr[DATA_SIZE]; }
- ecpg_build_params calls ecpg_store_input for descriptor_item.data just as it does for
the following input variable.

exec sql insert into foo values(:binary_var);
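
# Idea-1 could be sketched roughly as follows. This is only an illustration under
# assumptions: the struct layouts are reduced, hypothetical stand-ins (not the real
# ecpglib definitions), and the function names are invented for this example. The point
# is that the length travels inside the copied struct, so no separate length field is needed.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_DATA_SIZE 16    /* hypothetical stand-in for DATA_SIZE */

/* Hypothetical stand-in for the struct generated for a bytea host variable. */
struct bytea_idea1
{
    int  len;
    char arr[SKETCH_DATA_SIZE];
};

/* Hypothetical, reduced stand-in for struct descriptor_item. */
struct desc_item_idea1
{
    char *data;        /* holds a copy of the whole bytea struct */
    int   is_binary;   /* marks that data is not a C-string */
};

/* SET DESCRIPTOR side: copy the whole struct (length and bytes) into the item. */
int set_desc_idea1(struct desc_item_idea1 *item, const struct bytea_idea1 *src)
{
    char *copy = malloc(sizeof(*src));

    if (copy == NULL)
        return -1;
    memcpy(copy, src, sizeof(*src));
    item->data = copy;
    item->is_binary = 1;
    return 0;
}

/* EXECUTE side: recover the length from the stored struct copy. */
int stored_len_idea1(const struct desc_item_idea1 *item)
{
    struct bytea_idea1 tmp;

    memcpy(&tmp, item->data, sizeof(tmp));
    return tmp.len;
}

/* EXECUTE side: locate the raw bytes inside the stored struct copy. */
const char *stored_bytes_idea1(const struct desc_item_idea1 *item)
{
    return item->data + offsetof(struct bytea_idea1, arr);
}
```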

My Idea-2 is:
- ECPGset_desc copies the data to descriptor_item.data, sets the length in
descriptor_item.data_len, and sets the type information in descriptor_item.is_binary.
- ecpg_build_params only does memcpy as follows, without ecpg_store_input:

if (descriptor_item.is_binary)
    memcpy(&tobeinserted, descriptor_item.data, descriptor_item.data_len);
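
# A slightly fuller sketch of Idea-2, again under assumptions: the struct below is a
# reduced, hypothetical stand-in for struct descriptor_item, the function names are
# invented, and the destination buffer is allocated here only to keep the example
# self-contained. It shows the length being carried in data_len and used by memcpy,
# so zero bytes in the data are preserved.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced stand-in for struct descriptor_item. */
struct desc_item_idea2
{
    char *data;        /* raw bytes only, no length prefix */
    int   data_len;    /* number of valid bytes in data */
    int   is_binary;   /* marks that data is not a C-string */
};

/* SET DESCRIPTOR side: copy the bytes and record their length. */
int set_desc_idea2(struct desc_item_idea2 *item, const void *bytes, int len)
{
    /* Allocate at least one byte so a zero-length value still gets a buffer. */
    char *copy = malloc(len > 0 ? (size_t) len : 1);

    if (copy == NULL)
        return -1;
    memcpy(copy, bytes, (size_t) len);
    item->data = copy;
    item->data_len = len;
    item->is_binary = 1;
    return 0;
}

/*
 * EXECUTE side: build the parameter buffer with memcpy, bypassing any
 * C-string handling. Returns NULL for non-binary items or on allocation failure.
 */
char *build_param_idea2(const struct desc_item_idea2 *item, int *out_len)
{
    char *tobeinserted;

    if (!item->is_binary)
        return NULL;
    tobeinserted = malloc(item->data_len > 0 ? (size_t) item->data_len : 1);
    if (tobeinserted == NULL)
        return NULL;
    memcpy(tobeinserted, item->data, (size_t) item->data_len);
    *out_len = item->data_len;
    return tobeinserted;
}
```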

Thank you.

Ryo Matsumura

> -----Original Message-----
> From: Michael Meskes [mailto:meskes(at)postgresql(dot)org]
> Sent: Tuesday, February 12, 2019 11:06 PM
> To: Matsumura, Ryo/松村 量 <matsumura(dot)ryo(at)jp(dot)fujitsu(dot)com>
> Cc: Tsunakawa, Takayuki/綱川 貴之 <tsunakawa(dot)takay(at)jp(dot)fujitsu(dot)com>;
> pgsql-hackers(at)lists(dot)postgresql(dot)org
> Subject: Re: [PROPOSAL]a new data type 'bytea' for ECPG
>
> Matsumura-san,
>
> > I try to explain as follows. I would like to receive your comment.
> > ...
>
> I'm afraid I don't really understand your explanation. Why is handling
> a bytea so different from handling a varchar? I can see some
> differences due to its binary nature, but I do not understand why it
> needs so much special handling for stuff like its length? There is a
> length field in the structure but instead of using it the data field is
> used to store both, the length and the data. What am I missing?
>
> Please keep in mind that I did not write the descriptor code, so I may
> very well not see the obvious.
>
> Michael
> --
> Michael Meskes
> Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
> Meskes at (Debian|Postgresql) dot Org
> Jabber: michael at xmpp dot meskes dot org
> VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL
>
>
