
Re: ODBC driver for Windows & future...

From: Marko Ristola <marko(dot)ristola(at)kolumbus(dot)fi>
To: pgsql-odbc(at)postgresql(dot)org
Subject: Re: ODBC driver for Windows & future...
Date: 2004-12-21 09:56:15
Message-ID: 41C7F33F.2070106@kolumbus.fi
Lists: pgsql-odbc
Hi,

I have investigated the ODBC driver's behaviour when the query result has
very many rows.

It seems that the whole query result is stored as such into a memory buffer
before any further processing. That buffer is reallocated as needed: if the
buffer is, for example, 50 Mbytes, its size after reallocation will be
100 Mbytes. The malloc() or realloc() call takes a very long time.
I investigated this bottleneck in the Linux ODBC driver.

The procedure without an ODBC cursor is as follows:
1. Read all query result data from the backend into the huge buffer
    (sometimes restructuring the buffer, if some column size in the
buffer is exceeded).
    This seems to be the bottleneck, with the large malloc() operation.
2. Read (and convert) the requested results from the buffer for the given row.
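As a hypothetical illustration (not the driver's actual code), the cost of growing a result buffer by doubling can be sketched as follows. Each reallocation may copy the entire existing contents to the new block, so although the total copied is bounded by about twice the final size, the individual copies near the end move tens of megabytes in a single call:

```python
def doubling_growth_cost(final_size, initial=4096):
    """Return (number_of_reallocations, total_bytes_copied) when a buffer
    grows from `initial` bytes to at least `final_size` by doubling."""
    size = initial
    reallocs = 0
    copied = 0
    while size < final_size:
        copied += size  # realloc may copy the old contents to the new block
        size *= 2
        reallocs += 1
    return reallocs, copied

# Growing to 100 Mbytes from a 4 Kbyte buffer: 15 reallocations, and the
# last one alone copies a ~64 Mbyte block.
reallocs, copied = doubling_growth_cost(100 * 1024 * 1024)
print(reallocs, copied)
```

The arithmetic shows why the final reallocations dominate: the last copy is roughly as large as all earlier copies combined.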


Allocating huge buffers is inefficient.
The Linux operating system handles allocating big files much better
than allocating big memory areas.
It would be more efficient to use a temporary file:
sequential file scans are rather fast.

One way to solve the problem:

Maybe the key to removing the bottleneck is to tune the operating system
to free enough memory beforehand: if the operating system has 100 Mb of
unused memory, allocation is a lot faster than if it has only 2 Mb of
unused memory ready for fast memory allocations.

A good way to solve the problem:

The bottleneck can be avoided on the program side by using an ODBC cursor.
With an ODBC cursor one can fetch, for example, 1000 rows per batch
from the database server, then get the next 1000 rows with a new fetch.
This way there is no limit on the number of rows fetched, on any database.
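The batch-fetch pattern behind a cursor can be sketched as follows. The `row_source` generator here is a stand-in for the backend connection; a real driver would issue DECLARE CURSOR and FETCH FORWARD statements against the server instead:

```python
def fetch_in_batches(row_source, batch_size=1000):
    """Yield lists of up to `batch_size` rows, so client memory is bounded
    by one batch rather than by the whole result set."""
    batch = []
    for row in row_source:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch

# A "result set" of 2500 rows is consumed 1000 rows at a time.
batches = list(fetch_in_batches(range(2500)))
print([len(b) for b in batches])  # three batches: 1000, 1000, 500
```

Peak memory is proportional to the batch size, not the result size, which is the scaling property described above.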

With large result sets, there is always a memory limit on 32-bit
systems.
On 64-bit systems this limit goes away, but the slowdown of
non-local CPU memory remains even on high-end machines.
(NUMA machines have about 2 Gbytes of memory near each CPU; other memory
sits behind a slower bus.) So memory allocations over 2 Gb are bad for speed.

So the ODBC cursor scales well for any huge query result, on any
client operating system. It works even in Java, where memory is
extremely limited.

Other ways to solve the problem?
ODBC code: how about allocating memory in 4 Mb chunks? The operating
system handles small memory allocations more easily, and frees more memory
in the background while the ODBC driver fills the allocated chunk.
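A minimal sketch of the chunked-buffer idea (hypothetical, not the driver's code): rows accumulate in fixed-size chunks, so the driver never asks the operating system for one huge contiguous block and never copies existing data when the buffer grows:

```python
CHUNK_SIZE = 4 * 1024 * 1024  # the 4 Mb chunk size suggested above


class ChunkedBuffer:
    """Grow by appending new fixed-size chunks instead of reallocating."""

    def __init__(self, chunk_size=CHUNK_SIZE):
        self.chunk_size = chunk_size
        self.chunks = [bytearray()]

    def append(self, data):
        # Start a new chunk rather than reallocating the current one.
        if len(self.chunks[-1]) + len(data) > self.chunk_size:
            self.chunks.append(bytearray())
        self.chunks[-1].extend(data)

    def total_size(self):
        return sum(len(c) for c in self.chunks)


buf = ChunkedBuffer(chunk_size=10)  # tiny chunk size for demonstration
for _ in range(5):
    buf.append(b"row-data")  # 8 bytes each; no chunk fits two rows
print(len(buf.chunks), buf.total_size())  # 5 chunks, 40 bytes total
```

The trade-off is that the result is no longer one contiguous region, so row lookups must locate the right chunk first; in exchange, each allocation stays small and no data is ever recopied.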

Marko Ristola

Shachar Shemesh wrote:

Also of interest is that this very same client is also interested in the
ODBC driver for a different project. We have already done some porting of
their application, and have spotted a serious performance issue with
ODBC when long query results are retrieved. It is possible (though it
would be best not to count on it) that we will do some work in that
direction on ODBC in the foreseeable future. The reason we did not step
forward and offer ourselves as full maintainers of the code is that we
don't feel we have the resources for that. It is good to know, however,
that the facilities for sending patches and having them committed exist.

