RE: Fdw batch insert error out when set batch_size > 65535

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Fdw batch insert error out when set batch_size > 65535
Date: 2021-05-21 07:49:55
Message-ID: OS0PR01MB5716BFBD760428859058FCBD94299@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Sent: Friday, May 21, 2021 1:42 PM
> On Fri, May 21, 2021 at 8:18 AM houzj(dot)fnst(at)fujitsu(dot)com
> <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
> > Actually, I think if the (number of columns) * (number of rows) >
> > 65535, then we can get this error. But I think almost no one will set
> > such a large value, so adjusting the batch_size automatically when
> > doing INSERT seems an acceptable solution.
> >
> > In postgresGetForeignModifyBatchSize(), if we find that (num of
> > params) * batch_size is greater than the limit (65535), then we set it
> > to 65535 / (num of params).
> >
> > Thoughts?
>
> +1 to limit batch_size for postgres_fdw and it's a good idea to have a
> macro for the max params.
>
> Few comments:

Thanks for the comments.
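
For reference, here is a rough sketch of the kind of adjustment I have in mind
for postgresGetForeignModifyBatchSize(), written as a small standalone helper
just for illustration (the helper name adjust_batch_size is made up, and the
macro name PQ_QUERY_PARAM_MAX_LIMIT follows the suggestion in comment 4 below);
the actual patch may differ in detail:

    #define PQ_QUERY_PARAM_MAX_LIMIT 65535  /* assumed name for the libpq limit */

    /*
     * Sketch only: clamp batch_size so that (parameters per row) * batch_size
     * never exceeds the libpq limit on bind parameters per query.
     */
    static int
    adjust_batch_size(int batch_size, int p_nums)
    {
        if (p_nums > 0 && batch_size > PQ_QUERY_PARAM_MAX_LIMIT / p_nums)
            batch_size = PQ_QUERY_PARAM_MAX_LIMIT / p_nums;

        return batch_size;
    }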

> 1) How about using macro in the error message, something like below?
>     appendPQExpBuffer(errorMessage,
>                       libpq_gettext("number of parameters must be between 0 and %d\n"),
>                       PQ_MAX_PARAM_NUMBER);

Changed.
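
Just to make sure we are on the same page, the check on the libpq side would
look roughly like the following fragment (sketch only; the error-buffer
argument and surrounding code follow whatever the existing validation does):

    if (nParams < 0 || nParams > PQ_QUERY_PARAM_MAX_LIMIT)
    {
        appendPQExpBuffer(errorMessage,
                          libpq_gettext("number of parameters must be between 0 and %d\n"),
                          PQ_QUERY_PARAM_MAX_LIMIT);
        return 0;
    }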

> 2) I think we can choose not to mention the 65535 because it's hard to maintain
> if that's changed in the libpq protocol sometime in the future. How about "The
> final number of rows postgres_fdw inserts in a batch actually depends on the
> number of columns and the provided batch_size value. This is because of the
> limit the libpq protocol (which postgres_fdw uses to connect to a remote
> server) has on the number of query parameters that can be specified per query.
> For instance, if the number of columns * batch_size is more than the limit,
> then libpq emits an error. But postgres_fdw adjusts the batch_size to avoid
> this error."
> instead of
> + overrides an option specified for the server. Note if the batch size
> + exceed the protocol limit (number of columns * batch_size > 65535),
> + then the actual batch size will be less than the specified batch_size.

Thanks, your description looks better. Changed.
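
To give a concrete example of the adjustment (numbers made up for
illustration): with 100 columns and batch_size = 1000, a single batch would
need 100 * 1000 = 100000 parameters, which exceeds 65535, so postgres_fdw
would lower the effective batch size to 65535 / 100 = 655 rows per insert.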

> 3) I think "postgres_fdw should insert in each insert operation"
> doesn't hold after this patch. We can reword it to "postgres_fdw inserts in
> each insert operation".
> This option specifies the number of rows
> <filename>postgres_fdw</filename>
> should insert in each insert operation. It can be specified for a

Changed.

> 4) How about naming the macro as PQ_QUERY_PARAM_MAX_LIMIT?

Changed.

> 5) We can use macro in code comments as well.
> + * 65535, so set the batch_size to not exceed limit in a batch insert.

Thanks, I reorganized the code comments.

> 6) I think both code and docs patches can be combined into a single patch.

Combined.

Attaching the V2 patch. Please consider it for further review.

Best regards,
houzj

Attachment Content-Type Size
v2-0001-limit-the-fdw-batch-size.patch application/octet-stream 4.8 KB
