Skip site navigation (1) Skip section navigation (2)

Re: Batch INSERT failing with error 22P02

From: Eric Faulhaber <ecf(at)goldencode(dot)com>
To: Kris Jurka <books(at)ejurka(dot)com>
Cc: pgsql-jdbc(at)postgresql(dot)org
Subject: Re: Batch INSERT failing with error 22P02
Date: 2007-09-20 16:11:40
Message-ID: 46F29BBC.20400@goldencode.com (view raw or flat)
Thread:
Lists: pgsql-jdbc
Kris Jurka wrote:
>
>
> On Tue, 18 Sep 2007, Eric Faulhaber wrote:
>
>> Any idea what component is mixing up these types?  As I understand it 
>> from setting a breakpoint at the point the exception is thrown and 
>> tracking back through the stack, the OID type (26) is reported by the 
>> back-end and is compared with the type the driver expects for the 
>> given parameter.  But then I wonder why we would not see this error 
>> with the same back-end and the older JDBC driver...
>
> Basically you have two text columns in your table (lot and 
> serial_number) that alternate between backend type oids 1043 (varchar) 
> and 26 (oid). This is most likely because hibernate sometimes calls 
> setClob and sometimes it calls setString for these columns.  
> Additionally you have something that is setting an unknown type in the 
> driver that the server must resolve.  This could be something like 
> setNull(x, Types.OTHER), setObject(x, null), or by using the URL 
> parameter stringtype=unspecified. An unspecified type is common when 
> using timestamps, but you aren't using any in this table.

What in the log file is indicating the use of an unspecified type to 
you?  I'd like to investigate this further.

>
> So what's going on is, as you work through the rows you want to insert 
> the types of these columns occasionally changes and a the driver must 
> reparse a new server side prepared statement to handle the new type.  
> Some time after the new server side statement has been setup we get an 
> unspecified type and we ask the server to tell us what type we should 
> really use.  The server responds with a new type for something we 
> don't think should have changed and we get an error.
>
> Consider this set of events:
>
> Server Has    Current Data    Action
> nothing yet    1043, 26    No server statement yet, parse
> 1043, 26    1043, 26    server matches, execute away
> 1043, 26    1043, 1043    server doesn't match, reparse
> 1043, 1043    1043, 0        0 matches, so don't reparse, but describe
>
> So we are passed an unknown type and we ask the server to resolve it. 
> When we get the describe answer back we set it on the data, 
> overwriting the 0 with the returned 1043.  So that's all legal and 
> expected behavior, but for some reason in your case.
>
> So one potential solution (and optimization) is to avoid the describe 
> if we already have a prepared statement with type information filled 
> in.  We expect the backend to just echo back what the statement was 
> prepared with, but what's the point when we have that already have 
> that information in the driver?  So this would likely avoid your 
> problem, but it doesn't explain what's going on...
>
> I diffed the 8.1-408 release with 8.3dev for the core/v3 directory and 
> the only relevant change is SimpleParameterList#bind.  When deciding 
> to overwrite an existing type we now add a check for NULL_OBJECT that 
> 8.1 does not have:
>
> if (oid == Oid.UNSPECIFIED && paramTypes[index] != Oid.UNSPECIFIED && 
> value == NULL_OBJECT)
>             return;
>
> I know I haven't described this well, especially for people who aren't 
> versed in the frontend/backend protocol and how the driver issues 
> queries, but I must admit I have no idea why this is failing.  There 
> is a potential workaround, but it doesn't explain the root cause.  
> I'll keep investigating though...

Many thanks for a deep look into the log file and for a detailed 
description of what is going on.  Your analysis has given me some great 
insight which has led to a possible solution (see below).  Nevertheless, 
I'd still be interested to find out what triggered this change in 
behavior from version to version given my use case.

>
> 1) Are you using stringtype=unspecified in your URL?

No.

>
> 2) Any idea why these two columns are sometimes interpreted as strings 
> and sometimes as Clobs?  It doesn't look to be the case with the other 
> text fields in your table.

Yes, this is caused by our implementation of custom Hibernate types.  We 
map standard SQL types to custom Java data type wrappers due to specific 
runtime requirements.  We leverage Hibernate's UserType facility for 
this, which requires that we implement the code to serialize Java data 
objects to and from JDBC via an instance of either PreparedStatement 
(to) or ResultSet (from) handed to us by Hibernate at the appropriate times.

Note that our string columns are defined as text rather than character 
varying(n).  This is because we don't know how wide the string data we 
will be expected to handle can be.  Because the PG docs claim there is 
no performance disadvantage to using text vs. character varying, we go 
with text in all cases, thereby avoiding the headache of determining a 
max width for varchar columns.

As a result, our custom user type for string data publishes its 
corresponding JDBC type as Types.CLOB.  This was a somewhat arbitrary 
(and in retrospect, probably bad) decision, intended to match 
Hibernate's PostgreSQLDialect's mapping of Types.CLOB to PG's text 
type.  In the code which serializes data to JDBC, we need to handle the 
possibility of the data being unknown at runtime.  If the data is known, 
we use PreparedStatement.setString(index, "data"), which evidently uses 
Types.VARCHAR.  If unknown, we instead use 
PreparedStatement.setNull(index, Types.CLOB). 

This explains why we sometimes get strings and sometimes Clobs, but 
where the unspecified type comes from, I have no idea.  The only way 
data is imported is via the mechanism described above:  a 
PreparedStatement INSERT which Hibernate generates, into which we set 
positional parameters.  For this table, we would use only the following 
PreparedStatement methods:

setString(int, java.lang.String) --> for text data
setInt(int, int) --> for integer data
setBigDecimal(int, java.lang.BigDecimal) --> for numeric data
setDate(int, java.sql.Date) --> for date data
setNull(int, int) --> for unknown value, where the second parm is 
Types.CLOB, Types.INTEGER, Types.NUMERIC, or Types.DATE, as appropriate

We never use setNull(x, Types.OTHER) nor setObject(x, null).

Interestingly, if I publish either VARCHAR or LONGVARCHAR as the JDBC 
type associated with my custom user type for text, the problem no longer 
occurs.  This effectively changes the second parameter in the 
PreparedStatement.setNull(int, int) call from CLOB to 
VARCHAR/LONGVARCHAR.  I'll have to re-run the full import over the 
weekend to see if this has any repercussions with other tables.  Also, 
I'll have to do a lot of regression testing of our runtime to see if 
this impacts anything else, as I'm not familiar enough with all the 
places Hibernate may use this information.

>
> 3) The log file you sent me says: "Using 2 threads for import".  Could 
> you elaborate on what the different threads are doing?  Any chance 
> they're using the same Connection and somehow interfering with each 
> other?

Each table is imported on a dedicated thread.  So, despite the log 
message, only one thread was running in this case, since only one table 
was being imported.  The second thread was never dispatched.  Normally, 
we import a large number of tables, so the distribution of work is more 
efficient than in this case.  So, no, Connection instances are never 
shared across threads.

Thanks,
Eric Faulhaber



In response to

Responses

pgsql-jdbc by date

Next:From: Kris JurkaDate: 2007-09-20 17:35:02
Subject: Re: Batch INSERT failing with error 22P02
Previous:From: Albe LaurenzDate: 2007-09-20 07:37:01
Subject: Re: Postgres 8.2 and setFetchSize

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group