Re: Fdw batch insert error out when set batch_size > 65535

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Fdw batch insert error out when set batch_size > 65535
Date: 2021-05-21 05:41:40
Message-ID: CALj2ACXg3tPAChzFd94_k_tP3x61QTVpuqKZke3M4ef78xEcuA@mail.gmail.com
Lists: pgsql-hackers

On Fri, May 21, 2021 at 8:18 AM houzj(dot)fnst(at)fujitsu(dot)com
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> Hi,
>
> When reading the code, I noticed a possible issue with fdw batch insert:
> when I set batch_size > 65535 and insert more than 65535 rows into a foreign table,
> it throws an error.
>
> For example:
>
> ------------------
> CREATE FOREIGN TABLE vzdalena_tabulka2(a int, b varchar)
> SERVER omega_db
> OPTIONS (table_name 'tabulka', batch_size '65536');
>
> INSERT INTO vzdalena_tabulka2 SELECT i, 'AHOJ' || i FROM generate_series(1,65536) g(i);
>
> ERROR: number of parameters must be between 0 and 65535
> CONTEXT: remote SQL command: INSERT INTO public.tabulka(a, b) VALUES ($1, $2), ($3, $4)...
> ------------------
>
> Actually, I think if (number of columns) * (number of rows) > 65535, then we can
> get this error. But since almost no one will set such a large value, adjusting the
> batch_size automatically when doing INSERT seems an acceptable solution.
>
> In postgresGetForeignModifyBatchSize(), if we find that (num of params) * batch_size
> is greater than the limit (65535), then we set batch_size to 65535 / (num of params).
>
> Thoughts ?

+1 to limiting batch_size for postgres_fdw, and it's a good idea to have a
macro for the max number of params.

A few comments:
1) How about using the macro in the error message, something like below?
    appendPQExpBuffer(errorMessage,
                      libpq_gettext("number of parameters must be between 0 and %d\n"),
                      PQ_MAX_PARAM_NUMBER);
2) I think we can choose not to mention 65535 explicitly, because it's hard
to maintain if that changes in the libpq protocol sometime in future. How
about
"The final number of rows postgres_fdw inserts in a batch actually
depends on the number of columns and the provided batch_size value.
This is because of the limit the libpq protocol (which postgres_fdw
uses to connect to a remote server) imposes on the number of query
parameters that can be specified per query. For instance, if the
number of columns * batch_size is more than the limit, then libpq
emits an error. But postgres_fdw adjusts the batch_size to avoid this
error."
instead of
+ overrides an option specified for the server. Note if the batch size
+ exceed the protocol limit (number of columns * batch_size > 65535),
+ then the actual batch size will be less than the specified batch_size.
3) I think "postgres_fdw should insert in each insert operation"
doesn't hold after this patch. We can reword it to "postgres_fdw
inserts in each insert operation".
    This option specifies the number of rows
    <filename>postgres_fdw</filename>
    should insert in each insert operation. It can be specified for a
4) How about naming the macro as PQ_QUERY_PARAM_MAX_LIMIT?
5) We can use macro in code comments as well.
+ * 65535, so set the batch_size to not exceed limit in a batch insert.
6) I think both code and docs patches can be combined into a single patch.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
