Re: Fdw batch insert error out when set batch_size > 65535

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Fdw batch insert error out when set batch_size > 65535
Date: 2021-06-08 18:34:28
Message-ID: 68b4d5bd-d9fd-4337-e677-0bc5b3b7db6f@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 5/31/21 6:01 AM, Bharath Rupireddy wrote:
> On Mon, May 31, 2021 at 1:21 AM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> Hi,
>>
>> I took at this patch today. I did some minor changes, mostly:
>>
>> 1) change the code limiting batch_size from
>>
>> if (fmstate->p_nums > 0 &&
>> (batch_size * fmstate->p_nums > PQ_QUERY_PARAM_MAX_LIMIT))
>> {
>> batch_size = PQ_QUERY_PARAM_MAX_LIMIT / fmstate->p_nums;
>> }
>>
>> to
>>
>> if (fmstate && fmstate->p_nums > 0)
>> batch_size = Min(batch_size,
>> PQ_QUERY_PARAM_MAX_LIMIT / fmstate->p_nums);
>>
>> which I think is somewhat clearer / more common patter.
>
> Agree, that looks pretty good.
>
>> 2) I've reworded the docs a bit, splitting the single para into two. I
>> think this makes it clearer.
>
> LGTM, except one thing that the batch_size description says "should
> insert in", but it's not that the value entered for batch_size is
> always honoured right? Because this patch might it.
>
> This option specifies the number of rows
> <filename>postgres_fdw</filename>
> should insert in each insert operation. It can be specified for a
>
> So, I suggest to remove "should" and change it to:
>
> This option specifies the number of rows
> <filename>postgres_fdw</filename>
> inserts in each insert operation. It can be specified for a
>

I think the "should" indicates exactly that postgres_fdw may adjust the
batch size. Without it it sounds much more definitive, so I kept it.

>> Attached is a patch doing this. Please check the commit message etc.
>> Barring objections I'll get it committed in a couple days.
>
> One minor comment:
> In the commit message, Int16 is used
> The FE/BE protocol identifies parameters with an Int16 index, which
> limits the maximum number of parameters per query to 65535. With
>
> and in the code comments uint16 is used.
> + * parameters in a batch is limited to 64k (uint16), so make sure we don't
>
> Isn't it uint16 in the commit message too? Also, can we use 64k in the
> commit message instead of 65535?
>

No, the "Int16" refers to the FE/BE documentation, where we use Int16.
But in the C code we interpret it as uint16.

I've added a simple regression test to postgres_fdw, testing that batch
sizes > 65535 work fine, and pushed the fix.

I've considered checking the value in postgres_fdw_validator and just
rejecting anything over 65535, but I've decided against that. We'd still
need to adjust depending on number of columns anyway.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeremy Schneider 2021-06-08 18:35:57 Re: logical decoding bug: segfault in ReorderBufferToastReplace()
Previous Message Alvaro Herrera 2021-06-08 18:01:51 Re: pg14b1 stuck in lazy_scan_prune/heap_page_prune of pg_statistic