Re: OutOfMemory hibernate scroll with 2M records | Postgresql 8.4 DB

From: Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: Ankit Kumar <ankitk(at)xebia(dot)com>, pgsql-bugs(at)postgresql(dot)org
Subject: Re: OutOfMemory hibernate scroll with 2M records | Postgresql 8.4 DB
Date: 2009-12-21 18:35:02
Message-ID: 4B2FBFD6.6070007@postnewspapers.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 22/12/2009 1:25 AM, Greg Stark wrote:
> On Mon, Dec 21, 2009 at 5:04 PM, Ankit Kumar<ankitk(at)xebia(dot)com> wrote:
>> Thanks for your response. Hibernate works well when I change the DB to SQL
>> server but somehow the moment I point to Postgresql it start generating
>> OutOfMemory. Is there some configuration at DB end to ensure it starts using
>> the cursors.
>
> No Postgresql is not generating these errors -- they're Java errors
> and Postgresql is not written in Java.

However, PostgreSQL's JDBC driver *is*.

OP: I suggest following this up on the pgsql-jdbc list, where you're
likely to get people who work with these tools regularly and may be
better equipped to advise you.

> The Hibernate people -- I don't know
> what their mailing list is

They don't have one. There are forums at http://forums.hibernate.com/
but don't expect much. It's not a particularly helpful or friendly
community - lots of people asking questions, few people helping out or
answering them. I blame painful to use, crappy web forums.

You could always try http://stackoverflow.com/ .

> -- can answer questions about what
> Hibernate does differently for MSSQL versus Postgres.

In general it tries to behave almost exactly the same between different
DBs. Java folks (as a broad group) seem to be big believers in "the
database is a commodity product, they should all look and behave exactly
the same"*. If they can avoid admitting to a difference between two
databases, they will.

There are a few differences in SQL dialect handling, but overall
behaviour is VERY similar. It expects the JDBC driver to do the work of
making the database look like the standard JDBC interface except for SQL
dialect quirks. This usually even works.

Hibernate, like Pg, expects the session (transaction) to be aborted and
re-tried from the start if things go wrong, and is generally a good
conceptual fit for how Pg does things. So it generally gets away with
this attitude.

Pg's JDBC driver isn't exactly fully to spec, though, so there are
issues where Hibernate gets different behaviour than the spec requires
and chokes on it. That's not a *bug* in the Pg JDBC driver per se as it
doesn't claim to be fully JDBC3/4 compliant, but it's problematic
nonetheless.

* Different features and trade-off choices for different needs, you say?
Never! Let's code to the lowest common denominator and do all that
complicated "join" stuff in Java by fetching the raw tables then
filtering and processing them client-side. Hmm, actually, I'm sure I
have my nice, standard file open/seek/close APIs close to hand, I should
use throw out all this unnecessarily complicated "relational database"
stuff that I have to fight so hard to get to store my data in the
dumbest way possible - I'll just build an adapter in a java midlayer to
do what I need.

(sigh). Thankfully Hibernate isn't this dumb, but lots of ORMs try very
hard to be. Even with Hiberante, attempts to actually (gasp) use a
database's features are often met with "that's not portable between
databases, so you shouldn't want to do it" or dark mutterings about
"legacy", which is the Java word for "not completely universal (yet),"
"innovative and interesting" or "a solution to a problem that wasn't
written in Java".

Bitter? Me? No, really?

> If they say it's
> doing the same thing then once you're clear on what that is that it's
> doing you can speak to the JDBC Postgresql driver people
> (pgsql-jdbc(at)postgresql(dot)org) and ask whether they handle that case
> poorly. But if you start with the jdbc people and you can't answer
> questions about what Hibernate is doing then they're not going to know
> how to help (unless they've seen it before I guess).

In this case, before posting to the pgsql-jdbc list you should turn on
'log_statement = all' in the Pg server config, and enable debug-level
logging of the whole 'org.hibernate' tree in your slf4j (via log4j or
whatever) config.

Then re-test. Record what SQL is actually being sent to the server (as
seen in the server logs) and what Hibernate is doing as per its logs.
Upload those logs to a suitable pastebin site or gzip them and post them
somewhere people can get to them - do *NOT* attach them to a message to
the mailing list.

Finally, make up a COMPLETELY SELF CONTAINED TEST CASE - with ".sql"
database dump or creation script, plus complete and compilable
stand-alone Java code to demonstrate the problem. If you can't hand out
your data, write something that generates dummy data that'll reproduce
the problem.

Read the logs, take note of any bits that look informative, and show
them as excerpts in your message.

*then* post on the pgsql-jdbc list and see if anyone can offer advice.

--
Craig Ringer

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Robert Haas 2009-12-22 03:13:02 Re: BUG #5250: Tutorial examples(pre-compiled) not present with precompiled binary version of PostgreSQL.
Previous Message Greg Stark 2009-12-21 17:25:41 Re: OutOfMemory hibernate scroll with 2M records | Postgresql 8.4 DB