Thanks for taking the time to look into the problem. It is much appreciated.
Tom Lane wrote:
> Michael Akinde <michael(dot)akinde(at)met(dot)no> writes:
>> I use the following script to create a test table. For
>> /tmp/oidfile.temp, I use "FAQ_farsi" from the PostgreSQL installation
>> (about 140kb).
> Okay, I ran this with about 900MB of shared buffers (about as much as I
> thought I could make it without descending into swap hell ...) and there
> is no memory leak that I can see. What I *do* see is that the process
> size as reported by "top" quickly jumps to 900MB plus and then sits
> there. This is not a memory leak though, it is just a side effect of
> the way "top" reports usage of shared memory. Basically, a shared
> buffer starts getting charged against a given process the first time
> that process touches that buffer. Your test case involves reading a lot
> of blocks of pg_largeobject and that results in touching a lot of
> buffers.
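The accounting described above can be illustrated with a toy model (this is an illustration of how "top" attributes shared memory, not actual PostgreSQL or kernel code): a shared buffer is charged against a process the first time that process touches it, so a big scan makes the reported size jump, but re-touching the same buffers doesn't grow it further.

```python
# Toy model of "top"'s shared-memory accounting: a shared buffer
# counts toward a process's reported size the first time that
# process touches it. Illustration only; not PostgreSQL code.

BUFFER_SIZE = 8192  # PostgreSQL's default block size, in bytes

class Backend:
    def __init__(self):
        self.touched = set()   # shared buffers this process has touched

    def touch(self, buffer_id):
        self.touched.add(buffer_id)

    def reported_shared_bytes(self):
        # every touched shared buffer is charged to this process
        return len(self.touched) * BUFFER_SIZE

backend = Backend()

# A first large-object scan touches 1000 distinct shared buffers ...
for buf in range(1000):
    backend.touch(buf)
first_pass = backend.reported_shared_bytes()

# ... and a second scan of the same blocks touches no new buffers,
# so the reported size stays flat: growth, but not a leak.
for buf in range(1000):
    backend.touch(buf)
second_pass = backend.reported_shared_bytes()

print(first_pass, second_pass)  # prints "8192000 8192000"
```

The point is that the process size plateaus at roughly the shared_buffers setting, which matches the "jumps to 900MB plus and then sits there" behavior.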
Hmm. Just some questions to clarify for our benefit, if you can find the
time:
Why does the size of the blob make a difference to lo_open? After all,
other than opening the blob to obtain a descriptor, we don't touch the
blob itself.
Also, since each blob is opened and then closed, why does the process
allocate new memory for the next lo_open rather than reuse the existing
memory? If this is the intended behavior (as it appears to be), is there
some way we could force lo_open to reuse that memory? That would seem to
be desirable behavior, at least for us.
> So basically I don't see a problem here. If you are noticing a
> performance issue in this area, it may indicate that you have
> shared_buffers set too large, ie, using more RAM than the machine
> can really afford to spare. That leads to swapping which drives
> performance down.
I suppose this belongs on another mailing list, but 2 GB of shared
buffers on a 16 GB server that isn't used for anything else doesn't seem
like that much. Is there an up-to-date source with a thorough discussion
of these settings? Everything we've found seems to be from 2003.
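For what it's worth, a commonly cited rule of thumb (a starting point, not official guidance, and workload-dependent) is to begin with shared_buffers at roughly 25% of physical RAM on a dedicated database server:

```shell
# Rule-of-thumb starting point (an assumption, not a guarantee):
# shared_buffers ~= 25% of physical RAM on a dedicated server.
ram_mb=$((16 * 1024))               # a 16 GB server
shared_buffers_mb=$((ram_mb / 4))   # 25% of RAM
echo "${shared_buffers_mb}MB"       # prints "4096MB"
```

By that rule the 2 GB setting on a 16 GB machine is, if anything, on the conservative side.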
The case worries us a bit for two reasons.
Firstly, we expect much bigger retrieval queries in production (1
million rows rather than 100,000), and we've already seen the database
max out physical memory at around 14 GB (with shared memory usage still
reported at 2 GB) and allocate huge amounts of virtual memory (~30 GB)
for queries of this kind. Some of that memory usage is of course caused
by our code (which also retrieves from many other tables), but the vast
majority of it appears to be caused by lo_open. Secondly, we will see
exponential growth in the size of the blobs over the lifetime of the
system, which bodes poorly for solving the problem with more hardware:
memory is getting cheaper, but not that fast.
Asking the users to "please not run big queries" is obviously not a
tenable option in the long run, so if we can't tune the problem away,
and the large object facility is working as designed, then we'll need to
figure out a different solution for the system.
We had been considering toasted byteas, but as I understand it they
don't support random I/O, which would seem to rule them out as a
practical alternative. Do you think it would be practicable to fix the
PostgreSQL large object implementation so that it is more suitable for
our needs, or would it be simpler to do our own implementation of blobs in
Michael Akinde
Database Architect, met.no
In response to: BUG #3881: lo_open leaks memory (pgsql-bugs, 2008-01-22)