From: Vitalii Tymchyshyn <vit(at)tym(dot)im>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: PG-JDBC Mailing List <pgsql-jdbc(at)postgresql(dot)org>
Subject: Re: Disk buffering of resultsets
Date: 2014-10-06 02:22:03
Message-ID: CABWW-d3jpnTLeY3mrRJtNW+2NNN21kPh=rXDXowOkB5qYMyVfw@mail.gmail.com
Lists: pgsql-jdbc
Hello.
As for routing, I meant exactly that: attaching the error to the original
statement the result belongs to, and throwing it as soon as someone reaches
the right point by calling a method on that original statement.
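A minimal sketch of that routing idea (all names here are hypothetical, not
PgJDBC internals): the background buffering code parks the error on the
originating statement, and the next method call on that statement re-throws it.

```java
import java.sql.SQLException;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: an error hit while the driver buffers rows is parked
// on the originating statement and re-thrown the next time the application
// touches that statement or its result set.
class DeferredErrorStatement {
    private final AtomicReference<SQLException> pending = new AtomicReference<>();

    // Called by the (hypothetical) background buffering code on failure.
    void routeError(SQLException e) {
        pending.set(e);
    }

    // Every public method on the statement/result set would start with this.
    private void checkPending() throws SQLException {
        SQLException e = pending.getAndSet(null);
        if (e != null) {
            throw e;
        }
    }

    // Example of a method that surfaces the routed error to the caller.
    boolean next() throws SQLException {
        checkPending();
        // ... advance over the buffered rows here ...
        return false;
    }
}
```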
As for threading, I tend to agree with you. It can be revisited in Java 8,
where there is a common default executor that can be used.
The primary problem, as I see it, is OOMs on large result sets. Currently
the workaround is to use fetchSize, but it works only in very limited
scenarios.
I can see two ways here: improve fetchSize handling (potentially setting the
default fetchSize to some value like 10000), or store result sets outside
the heap.
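To illustrate why I say fetchSize works only in limited scenarios: as far as
I understand, PgJDBC only streams rows through a server-side portal when all
of the conditions below hold, otherwise the whole result set is buffered in
heap at once. This is a sketch of my understanding, not actual driver code:

```java
import java.sql.ResultSet;

// Sketch (my understanding, not PgJDBC source): fetchSize actually limits
// memory use only when every one of these conditions is true.
final class FetchSizeCheck {
    static boolean usesCursorFetch(boolean autoCommit, int fetchSize, int resultSetType) {
        return !autoCommit                                   // must be inside a transaction
            && fetchSize > 0                                 // a fetch size must be set
            && resultSetType == ResultSet.TYPE_FORWARD_ONLY; // no scrolling allowed
    }
}
```

In other words, today one has to combine conn.setAutoCommit(false) with
stmt.setFetchSize(n), or the driver reads everything into heap up front.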
One more thing to remember is time to first row. It would be great to get
the first row quickly, without first reading, parsing and storing all
1,000,000 rows somewhere.
Best regards, Vitalii Tymchyshyn
2014-10-05 21:53 GMT-04:00 Craig Ringer <craig(at)2ndquadrant(dot)com>:
> On 10/05/2014 03:16 AM, Vitalii Tymchyshyn wrote:
> > Well, the exception in this case should be "routed" to the statement
> > that ran the problematic query. The next one should get something only
> > if the connection became unusable as a result.
>
> You can't do that, the logic flow and timing are all wrong.
>
> The opportunity to throw an exception at the right place is gone and
> past by the time you get to this point. You can't go back in the code
> and throw an exception at some prior point of execution.
>
> If the statement still exists you can attach the exception that would be
> thrown to it such that you throw it next time somebody calls a method on
> that statement or its result set though.
>
> >> It also
> >> makes no sense, as when you execute a new statement, the resultset of
> >> the prior statement is automatically closed.
> >
> > Do they? I think they are closed only for the same statement object.
> > Different statement may try to reuse the connection.
>
> Yay, specification reading time.
>
> https://jcp.org/aboutJava/communityprocess/final/jsr221/
>
> The JDBC spec is one of the less awful Java specs, thankfully. (Whatever
> you do, do not attempt to read the JSF2 specification).
>
> Short version: you're right, you can have multiple open statements, each
> with a valid open resultset. The JDBC implementation is free to decide
> how it does this based on the capabilities of the database.
>
> We're allowed to close all statements and result sets at commit time,
> either implicit autocommit or explicit commit. If a user wants to keep a
> resultset past that time they must set it as a holdable resultset with
> the HOLD_CURSORS_OVER_COMMIT flag.
>
>
> Detail of relevant spec sections:
>
> 13.1.1 "Creating statements":
>
> > Each Connection object can create multiple Statement objects that may
> > be used concurrently by the program.
>
> 13.1.4 (pdf page 117)
>
> > An application calls the method Statement.close to indicate that it has
> > finished processing a statement. All Statement objects will be closed
> > when the connection that created them is closed. [...]
> >
> > Closing a Statement object will close and invalidate any instances of
> > ResultSet produced by that Statement object. [...]
>
> and 15.1 "Result Sets", particularly 15.1.3 "ResultSet Holdability"
>
> > Calling the method Connection.commit can close the ResultSet objects
> > that have been created during the current transaction.
>
> and 15.2.5 "Closing a resultset object":
>
> A ResultSet object is explicitly closed when
> * The close method on the ResultSet is executed, thereby releasing any
> external resources
> * The Statement or Connection object that produced the ResultSet is
> explicitly closed
> A ResultSet object is implicitly closed when
> * The associated Statement object is re-executed
> * The ResultSet is created with a Holdability of
> CLOSE_CURSORS_AT_COMMIT and an implicit or explicit commit occurs
>
> Also, note that PgJDBC declares that:
>
> "For Select statements, the statement is complete when the associated
> result set is closed."
>
> so in autocommit we're allowed to keep a transaction open with a cursor
> streaming results until the resultset is closed.
>
>
> >> > - do it in background (a little conflicting to postponing, but more
> >> > thinking is needed). Return first fetchSize rows and start copying
> >> > network to disk in background thread.
> >>
> >> You don't want to go there. It's horribly complicated to work with
> >> background threads portably in the JDBC driver. For an example, see the
> >> recent discussion of Timer handling.
> >
> > But we've already got separate threads. Why can't we have some more?
>
> The threading we're already doing is causing issues.
>
> Threading is very different in Java SE and Java EE environments. Lots of
> care is required to cope with driver unloads/reloads, avoiding
> classloader leaks, etc.
>
> >> I'm pretty sure this is already possible in PgJDBC (when you set a fetch
> >> size) though I'd need to write a test case and do some code reading to
> >> be totally sure.
> >
> > As far as I understand it's not, and by implementing this we could solve
> > a lot of issues for large result sets.
>
> I think we might have a different idea of what "this" is.
>
> Perhaps it would be helpful if you described the underlying problem
> you're trying to solve? We've kind of started at the solution, without
> really defining the problem the solution is for.
>
> From what I can tell I think you might be trying to make holdable
> resultsets in autocommit mode more efficient by implementing lazy
> resultset fetching.
>
> --
> Craig Ringer http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>
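One more note on the holdability point above: in standard JDBC a result set
survives commit only if it was explicitly requested as holdable when the
statement was created. A minimal sketch (plain JDBC, nothing PgJDBC-specific):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Standard JDBC: request a holdable result set so commit does not close it.
// Without the third argument the driver's default holdability applies.
class HoldableExample {
    static Statement holdableStatement(Connection conn) throws SQLException {
        return conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY,
                ResultSet.CONCUR_READ_ONLY,
                ResultSet.HOLD_CURSORS_OVER_COMMIT); // survive commit
    }
}
```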