From: Marko Ristola <marko(dot)ristola(at)kolumbus(dot)fi>
To: pgsql-odbc(at)postgresql(dot)org
Subject: Re: ODBC driver for Windows & future...
Date: 2004-12-21 09:56:15
Message-ID: 41C7F33F.2070106@kolumbus.fi
Lists: pgsql-odbc


Hi,

I have investigated the ODBC driver's behaviour when a query result has
very many rows.

It seems that the whole query result is stored as-is into a memory
buffer before any further processing. That buffer is reallocated
whenever it fills up: if the buffer is, for example, 50 Mbytes, its
size after reallocation will be 100 Mbytes. The malloc() or realloc()
call then takes a very long time.
I investigated this bottleneck in the ODBC driver on Linux.

The procedure without an ODBC cursor is as follows:
1. Read all query result data from the backend into the huge buffer
(occasionally restructuring the buffer if some column's size in the
buffer is exceeded).
This seems to be the bottleneck, with the large malloc() operation.
2. Read (and convert) the requested results from the buffer for the given row.

Allocating huge buffers is inefficient.
The Linux operating system handles growing a big file much better
than growing a big memory area.
It would be more efficient to use a temporary file:
sequential file scans are rather fast.

One way to reduce the problem:

Perhaps the bottleneck can be eased by tuning the operating system to
free enough memory beforehand: if the operating system has 100 Mb of
unused memory, allocations are a lot faster than if it has only 2 Mb
of unused memory ready for fast allocation.

A good way to solve the problem:

The bottleneck can be avoided on the application side by using an ODBC
cursor. With an ODBC cursor one can fetch, for example, 1000 rows per
batch from the database server; the next 1000 rows arrive with a new fetch.
This way there is no limit on the number of rows fetched, on any database.

With large result sets there is always a memory limit on 32 bit
systems.
On 64 bit systems that limit goes away, but the slowdown of non-local
CPU memory remains even on high-end machines.
(NUMA machines have about 2 Gbytes of memory near each CPU; other
memory sits behind a slower bus.) So memory allocations over 2 Gb are
not good for speed.

So the ODBC cursor scales well to any huge query result, on any
client operating system. It even works in Java, where memory is
extremely limited.

Other ways to solve the problem?
In the ODBC code: how about allocating memory in 4 Mb chunks? The
operating system handles small memory allocations more easily, and it
can free more memory in the background while the ODBC driver fills the
current chunk.

Marko Ristola

Shachar Shemesh wrote:

> Also of interest is that this very same client is also interested in the
> ODBC driver for a different project. We have already done some porting of
> their application, and have spotted a serious performance issue with
> ODBC when long query results are retrieved. It is possible (though it
> would be best not to count on it) that we will do some work in that
> direction on ODBC in the foreseeable future. The reason we did not step
> forward and offer ourselves as full maintainers of the code is that we
> don't feel we have the resources for that. It is good to know, however,
> that the facilities for sending patches and having them committed exist.
