Re: Batch INSERT failing with error 22P02

From: Eric Faulhaber <ecf(at)goldencode(dot)com>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Batch INSERT failing with error 22P02
Date: 2007-09-20 16:11:40
Message-ID: 46F29BBC.20400@goldencode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-jdbc

Kris Jurka wrote:
>
>
> On Tue, 18 Sep 2007, Eric Faulhaber wrote:
>
>> Any idea what component is mixing up these types? As I understand it
>> from setting a breakpoint at the point the exception is thrown and
>> tracking back through the stack, the OID type (26) is reported by the
>> back-end and is compared with the type the driver expects for the
>> given parameter. But then I wonder why we would not see this error
>> with the same back-end and the older JDBC driver...
>
> Basically you have two text columns in your table (lot and
> serial_number) that alternate between backend type oids 1043 (varchar)
> and 26 (oid). This is most likely because hibernate sometimes calls
> setClob and sometimes it calls setString for these columns.
> Additionally you have something that is setting an unknown type in the
> driver that the server must resolve. This could be something like
> setNull(x, Types.OTHER), setObject(x, null), or by using the URL
> parameter stringtype=unspecified. An unspecified type is common when
> using timestamps, but you aren't using any in this table.

What in the log file is indicating the use of an unspecified type to
you? I'd like to investigate this further.

>
> So what's going on is, as you work through the rows you want to insert
> the types of these columns occasionally changes and a the driver must
> reparse a new server side prepared statement to handle the new type.
> Some time after the new server side statement has been setup we get an
> unspecified type and we ask the server to tell us what type we should
> really use. The server responds with a new type for something we
> don't think should have changed and we get an error.
>
> Consider this set of events:
>
> Server Has Current Data Action
> nothing yet 1043, 26 No server statement yet, parse
> 1043, 26 1043, 26 server matches, execute away
> 1043, 26 1043, 1043 server doesn't match, reparse
> 1043, 1043 1043, 0 0 matches, so don't reparse, but describe
>
> So we are passed an unknown type and we ask the server to resolve it.
> When we get the describe answer back we set it on the data,
> overwriting the 0 with the returned 1043. So that's all legal and
> expected behavior, but for some reason in your case.
>
> So one potential solution (and optimization) is to avoid the describe
> if we already have a prepared statement with type information filled
> in. We expect the backend to just echo back what the statement was
> prepared with, but what's the point when we have that already have
> that information in the driver? So this would likely avoid your
> problem, but it doesn't explain what's going on...
>
> I diffed the 8.1-408 release with 8.3dev for the core/v3 directory and
> the only relevant change is SimpleParameterList#bind. When deciding
> to overwrite an existing type we now add a check for NULL_OBJECT that
> 8.1 does not have:
>
> if (oid == Oid.UNSPECIFIED && paramTypes[index] != Oid.UNSPECIFIED &&
> value == NULL_OBJECT)
> return;
>
> I know I haven't described this well, especially for people who aren't
> versed in the frontend/backend protocol and how the driver issues
> queries, but I must admit I have no idea why this is failing. There
> is a potential workaround, but it doesn't explain the root cause.
> I'll keep investigating though...

Many thanks for a deep look into the log file and for a detailed
description of what is going on. Your analysis has given me some great
insight which has led to a possible solution (see below). Nevertheless,
I'd still be interested to find out what triggered this change in
behavior from version to version given my use case.

>
> 1) Are you using stringtype=unspecified in your URL?

No.

>
> 2) Any idea why these two columns are sometimes interpreted as strings
> and sometimes as Clobs? It doesn't look to be the case with the other
> text fields in your table.

Yes, this is caused by our implementation of custom Hibernate types. We
map standard SQL types to custom Java data type wrappers due to specific
runtime requirements. We leverage Hibernate's UserType facility for
this, which requires that we implement the code to serialize Java data
objects to and from JDBC via an instance of either PreparedStatement
(to) or ResultSet (from) handed to us by Hibernate at the appropriate times.

Note that our string columns are defined as text rather than character
varying(n). This is because we don't know how wide the string data we
will be expected to handle can be. Because the PG docs claim there is
no performance disadvantage to using text vs. character varying, we go
with text in all cases, thereby avoiding the headache of determining a
max width for varchar columns.

As a result, our custom user type for string data publishes its
corresponding JDBC type as Types.CLOB. This was a somewhat arbitrary
(and in retrospect, probably bad) decision, intended to match
Hibernate's PostgreSQLDialect's mapping of Types.CLOB to PG's text
type. In the code which serializes data to JDBC, we need to handle the
possibility of the data being unknown at runtime. If the data is known,
we use PreparedStatement.setString(index, "data"), which evidently uses
Types.VARCHAR. If unknown, we instead use
PreparedStatement.setNull(index, Types.CLOB).

This explains why we sometimes get strings and sometimes Clobs, but
where the unspecified type comes from, I have no idea. The only way
data is imported is via the mechanism described above: a
PreparedStatement INSERT which Hibernate generates, into which we set
positional parameters. For this table, we would use only the following
PreparedStatement methods:

setString(int, java.lang.String) --> for text data
setInt(int, int) --> for integer data
setBigDecimal(int, java.lang.BigDecimal) --> for numeric data
setDate(int, java.sql.Date) --> for date data
setNull(int, int) --> for unknown value, where the second parm is
Types.CLOB, Types.INTEGER, Types.NUMERIC, or Types.DATE, as appropriate

We never use setNull(x, Types.OTHER) nor setObject(x, null).

Interestingly, if I publish either VARCHAR or LONGVARCHAR as the JDBC
type associated with my custom user type for text, the problem no longer
occurs. This effectively changes the second parameter in the
PreparedStatement.setNull(int, int) call from CLOB to
VARCHAR/LONGVARCHAR. I'll have to re-run the full import over the
weekend to see if this has any repercussions with other tables. Also,
I'll have to do a lot of regression testing of our runtime to see if
this impacts anything else, as I'm not familiar enough with all the
places Hibernate may use this information.

>
> 3) The log file you sent me says: "Using 2 threads for import". Could
> you elaborate on what the different threads are doing? Any chance
> they're using the same Connection and somehow interfering with each
> other?

Each table is imported on a dedicated thread. So, despite the log
message, only one thread was running in this case, since only one table
was being imported. The second thread was never dispatched. Normally,
we import a large number of tables, so the distribution of work is more
efficient than in this case. So, no, Connection instances are never
shared across threads.

Thanks,
Eric Faulhaber

In response to

Responses

Browse pgsql-jdbc by date

  From Date Subject
Next Message Kris Jurka 2007-09-20 17:35:02 Re: Batch INSERT failing with error 22P02
Previous Message Albe Laurenz 2007-09-20 07:37:01 Re: Postgres 8.2 and setFetchSize