Re: Proposal to fix Statement.executeBatch()

From: Barry Lind <barry(at)xythos(dot)com>
To: Rene Pijlman <rpijlman(at)wanadoo(dot)nl>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Proposal to fix Statement.executeBatch()
Date: 2001-08-27 18:07:55
Message-ID: 3B8A8C7B.1080106@xythos.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

Rene,

I see your statements below as incorrect:

> The intended behaviour is to send a set of update/insert/delete/DDL
> statements in one round trip to the database. Unfortunately,
> this optional JDBC feature cannot be implemented correctly with
> PostgreSQL, since the backend only returns the update count of
> the last statement send in one call with multiple statements.
> JDBC requires it to return an array with the update counts of
> all statements in the batch.

The intended behaviour is certainly to send all of the statements in one
round trip. And the JDBC2.1 spec certainly allows postgres to do just
that. Here is how I would suggest this be done in a way that is spec
compliant (Note: that I haven't looked at the patch you submited yet, so
forgive me if you have already done it this way, but based on your
comments in this email, my guess is that you have not).

Statements should be batched together in a single statement with
semicolons separating the individual statements (this will allow the
backend to process them all in one round trip).

The result array should return an element with the row count for each
statement, however the value for all but the last statement will be
'-2'. (-2 is defined by the spec to mean the statement was processed
successfully but the number of affected rows is unknown).

In the event of an error, then the driver should return an array the
size of the submitted batch with values of -3 for all elements. -3 is
defined by the spec as the corresponding statement failed to execute
successfully, or for statements that could not be processed for some
reason. Since in postgres when one statement fails (in non-autocommit
mode), the entire transaction is aborted this is consistent with a
return value of -3 in my reading of the spec.

I believe this approach makes the most sense because:
1) It implements batches in one round trip (the intention of the feature)
2) It is complient with the standard
3) It is complient with the current functionality of the backend

thanks,
--Barry

Rene Pijlman wrote:
> I've finished the secion on batch updates in the JDBC 2.0
> compliance documentation on
> http://lab.applinet.nl/postgresql-jdbc/ (see the quote of the
> relevant part below).
>
> In the short term I think two things need to be fixed:
> 1) don't begin, commit or rollback a transaction implicitly in
> Statement.executeBatch()
> 2) have executeBatch() throw a BatchUpdateException when it is
> required to do so by the JDBC spec
>
> If there are no objections from this list I intend to submit a
> patch that fixes 1), and perhaps also 2).
>
> Note that this may cause backward compatibility issues with JDBC
> applications that have come to rely on the incorrect behaviour.
> OTOH, there have been complaints on this list before, and those
> people would certainly be happy about the fix. E.g.
> http://fts.postgresql.org/db/mw/msg.html?mid=83832
>
> In the long run it would be nice if the backend would support
> returning one update count (and perhaps an OID) per statement
> send in a semicolon separated multi-statement call. Would this
> be something for the backend TODO list? OTOH, I'm not sure if
> this (small?) performance improvement is worth the trouble.
>
> "Batch updates
>
> The driver supports batch updates with the addBatch, clearBatch
> and executeBatch methods of Statement, PreparedStatement and
> CallableStatement. DatabaseMetaData.supportsBatchUpdates()
> returns true.
>
> However, executing statements in a batch does not provide a
> performance improvement with PostgreSQL, since all statements
> are internally send to the backend and processed one-by-one.
> That defeats the purpose of the batch methods. The intended
> behaviour is to send a set of update/insert/delete/DDL
> statements in one round trip to the database. Unfortunately,
> this optional JDBC feature cannot be implemented correctly with
> PostgreSQL, since the backend only returns the update count of
> the last statement send in one call with multiple statements.
> JDBC requires it to return an array with the update counts of
> all statements in the batch. Even though the batch processing
> feature currently provides no performance improvement, it should
> not be removed from the driver for reasons of backward
> compatibility.
>
> The current implementation of Statement.executeBatch() in
> PostgreSQL starts a new transaction and commits or aborts it.
> This is not in compliance with the JDBC specification, which
> does not mention transactions in the description of
> Statement.executeBatch() at all. The confusion is probably
> caused by a JDBC tutorial from Sun with example code which
> disables autocommit before calling executeBatch "so that the
> transaction will not be automatically committed or rolled back
> when the method executeBatch is called". This comment in the
> tutorials appears to be a misunderstanding. A good reason to
> disable autocommit before calling executeUpdate() is to be able
> to commit or rollback all statements in a batch as a unit. With
> autocommit enabled, the application would not know which
> statements had and had not been processed when an exception is
> thrown. It is the responsibility of the application, however, to
> disable autocommit and to commit or rollback a transaction. Note
> that Oracle's implementation of executeBatch() also does not
> commit or rollback a transaction implicitly. The implementation
> of Statement.executeBatch() in PostgreSQL should be changed to
> not begin, commit or rollback a transaction.
>
> Support for BatchUpdateException is not yet implemented. The
> implementation of executeBatch is incorrect, therefore, since it
> is required to throw a BatchUpdateException if one of the
> commands in the batch returns something other than an update
> count."
>
> Regards,
> René Pijlman
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
>
>

In response to

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Barry Lind 2001-08-27 18:23:08 Re: Re: [BUGS] Bug #428: Another security issue with the JDBC driver.
Previous Message Rene Pijlman 2001-08-27 18:01:18 Re: JDBC changes for 7.2 - wish list item