From: Oliver Jowett <oliver(at)opencloud(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Kris Jurka <books(at)ejurka(dot)com>, Bahadur Singh <bahadursingh(at)yahoo(dot)com>, pgsql-bugs(at)postgresql(dot)org, pgsql-jdbc(at)postgresql(dot)org
Subject: Re: [JDBC] BUG #1347: Bulk Import stopps after a while ( 8.0.0.
Date: 2004-12-13 20:58:16
Message-ID: 41BE0268.1060602@opencloud.com
Lists: pgsql-bugs, pgsql-jdbc

Tom Lane wrote:
> Kris Jurka <books(at)ejurka(dot)com> writes:
> 
>>    // To avoid this, we guess at how many queries we can send before the 
>>    // server -> driver stream's buffer is full (MAX_BUFFERED_QUERIES). 
> 
> 
> It seems essentially impossible for the driver to do this reliably,
> since it has no clue how much data any one query will return.

Right, but I'm not convinced that this is the problem here, as JDBC batch 
execution only permits statements that do not return result sets. The only 
case I can think of where this would break is if something is causing lots 
of logging output to be sent to the client (triggers raising notices, etc.).
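
For reference, here is roughly what the batch path under discussion looks 
like from user code; the connection details and table are made-up 
placeholders, but the API calls are standard JDBC:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;

    public class BatchInsertExample {
        public static void main(String[] args) throws Exception {
            // URL, credentials, and table are placeholders.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/test", "user", "password")) {
                try (PreparedStatement ps = conn.prepareStatement(
                        "INSERT INTO items (id, payload) VALUES (?, ?)")) {
                    for (int i = 0; i < 10000; i++) {
                        ps.setInt(1, i);
                        ps.setString(2, "row " + i);
                        ps.addBatch();
                    }
                    // Only update counts come back; a result-set-returning
                    // statement in a batch fails with BatchUpdateException.
                    ps.executeBatch();
                }
            }
        }
    }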

> How about instead thinking in terms of not filling the driver->server
> channel?  You have a good handle on how much data you have sent (or at
> least you could keep track of that), and if you bound it to 64K or so
> then you should be safe.  Perhaps the limit ought to be easily
> configurable just in case, but at least you'd be measuring something
> measurable.

That's possibly a better idea, but it does mean that we wouldn't be able 
to batch inserts that contain lots of data, and that's the use case I 
needed to support when I wrote this code in the first place.
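
If we did go that way, I'd imagine something along these lines (all the 
names here are hypothetical, not the driver's actual internals):

    import java.util.List;

    // Sketch: bound the bytes in flight on the driver->server channel
    // instead of guessing at the size of the server's responses.
    class BoundedBatchSender {
        private static final int MAX_OUTGOING_BYTES = 64 * 1024;

        void sendBatch(List<byte[]> encodedMessages) {
            int pending = 0;
            for (byte[] msg : encodedMessages) {
                if (pending + msg.length > MAX_OUTGOING_BYTES) {
                    // Drain the server's responses before sending more,
                    // so neither side blocks on a full socket buffer.
                    syncAndProcessResults();
                    pending = 0;
                }
                send(msg);
                pending += msg.length;
            }
            syncAndProcessResults();
        }

        private void send(byte[] msg) { /* write to the socket */ }
        private void syncAndProcessResults() { /* send Sync; read until ReadyForQuery */ }
    }

Note that the Sync in syncAndProcessResults() runs straight into the 
autocommit problem I mention below.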

Also, it's never going to be 100% reliable without a separate reader 
thread, as the server can spontaneously generate output (e.g. because of 
NOTIFY) regardless of how careful we are with our queries.
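
A fully robust version would need a dedicated reader, something like this 
(hypothetical names again):

    import java.io.IOException;

    // Sketch: a dedicated thread drains whatever the server sends,
    // including spontaneous NotificationResponse messages from NOTIFY,
    // so the receive buffer can never fill while the writer keeps going.
    class ReaderThreadSketch {
        void start() {
            Thread reader = new Thread(() -> {
                try {
                    while (true) {
                        dispatch(readMessage());
                    }
                } catch (IOException e) {
                    // Connection closed; the writer will notice on its own.
                }
            });
            reader.setDaemon(true);
            reader.start();
        }

        private Object readMessage() throws IOException { return new Object(); }
        private void dispatch(Object message) { /* route to waiting callers */ }
    }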

There's actually another problem with this code: the subdivision into 
smaller batches is not transparent when autocommit is on. We send a Sync 
at the end of each sub-batch, which causes an implicit commit. We should 
be sending a Flush instead, but that is harder for the driver to handle: 
a Flush does not provoke a response message from the server, so we would 
have to track the protocol state more closely. Given that the JDBC spec 
is silent about the autocommit semantics of batch execution anyway, I'm 
not too worried about fixing this urgently.
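
To make the difference concrete, here is a sketch of the two message 
flows; the helper names are hypothetical, and only Sync, Flush and 
ReadyForQuery are real v3 protocol messages:

    // With autocommit on, each Sync ends the implicit transaction, so
    // splitting one logical batch across Syncs commits it piecemeal.
    class SubBatchSketch {
        void syncPerChunk() {
            sendChunk(1); sync();   // implicit commit here...
            sendChunk(2); sync();   // ...and again here -- not atomic
        }

        void flushPerChunk() {
            sendChunk(1); flush();  // responses drained, transaction open
            sendChunk(2); sync();   // one implicit commit at the end
        }

        private void sendChunk(int n) { /* Parse/Bind/Execute messages */ }
        private void sync()  { /* send Sync; read until ReadyForQuery */ }
        private void flush() { /* send Flush; no ReadyForQuery follows, so
                                  we must count expected responses ourselves */ }
    }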

I'd like to see evidence that this is really the problem before tweaking 
this code. Given that the OP said that batch sizes of 1000-2000 worked 
OK, I'm not sure this code is at fault, since the maximum number of 
queries we'll send per batch is around 250 by default.
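
(For what it's worth, one plausible derivation of that ~250 figure; the 
constants here are my assumptions, not quoted from the driver source:)

    public class BufferMath {
        public static void main(String[] args) {
            int bufferBytes = 64 * 1024; // assumed server->driver buffer size
            int perQueryReply = 250;     // assumed response bytes per query
            System.out.println(bufferBytes / perQueryReply); // prints 262
        }
    }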

-O
