Re: plperl doesn't release memory

From: Dan Sugalski <dan(at)sidhe(dot)org>
To: "GIROIRE Nicolas (COFRAMI)" <nicolas(dot)giroire(at)airbus(dot)com>, Postgresql-General list <pgsql-general(at)postgresql(dot)org>
Subject: Re: plperl doesn't release memory
Date: 2005-03-31 14:57:07
Message-ID: a06210200be71b8149204@[172.24.18.155]
Lists: pgsql-general

At 8:38 AM +0200 3/31/05, GIROIRE Nicolas (COFRAMI) wrote:
>Hi,
>I work with William.
>
>In fact, we have already written the procedure in
>pl/pgsql, but it is too slow, and we use arrays,
>which are native in perl.
>The procedure is recursive and runs queries against PostgreSQL.
>Judging by how memory use grows, it seems that
>no memory is ever freed. I think that comes
>from the fact that we have a recursive procedure.
>
>The execution of the procedure takes 3 hours and
>ends with an out of memory error.
>
>Can we force pl/perl to free the memory used by variables?
>Or can we configure PostgreSQL to cope with this increase in load?
>Or is there another idea?

Perl generally frees things up as soon as they're
no longer used, but there are a few cases where
you'll run into trouble.

The first is the circular reference problem --
because perl uses reference counting, circular
data structures won't ever die on their own,
something you'll need to watch out for.
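
A minimal sketch of the circular reference case and one common way around it, assuming Scalar::Util (bundled with the 5.8 series) is available. The Node class and its destruction counter are made up for illustration; they only exist so you can see which objects actually get freed.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical class that counts how many of its objects were destroyed.
package Node;
our $destroyed = 0;
sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
sub DESTROY { $destroyed++ }

package main;

# Leaky version: parent and child hold strong references to each other,
# so neither reference count reaches zero when the block exits.
{
    my $parent = Node->new('parent');
    my $child  = Node->new('child');
    $parent->{child} = $child;
    $child->{parent} = $parent;
}
print "after leaky block:    $Node::destroyed destroyed\n";

# Fixed version: the back-reference is weakened, so it no longer keeps
# the parent alive and both objects are freed at the end of the block.
{
    my $parent = Node->new('parent');
    my $child  = Node->new('child');
    $parent->{child} = $child;
    $child->{parent} = $parent;
    weaken($child->{parent});
}
print "after weakened block: $Node::destroyed destroyed\n";
```

Breaking the cycle by hand (deleting one of the two links when you're done with the structure) works as well if weak references don't fit your code.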

You need to make sure the variables actually go
out of scope. With a recursive procedure this is
a definite worry, since perl cleans up when
variables go out of scope, and that doesn't
happen until a sub actually exits.
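
To illustrate the recursion point, here is a small sketch. The Tracker class is hypothetical and simply stands in for a large per-level working set; it counts how many instances are alive at once. The walk() sub either holds its data across the recursive call or drops it first.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a big per-level data structure.
package Tracker;
our ($live, $peak) = (0, 0);
sub new     { $live++; $peak = $live if $live > $peak; return bless {}, shift }
sub DESTROY { $live-- }

package main;

sub walk {
    my ($depth, $free_early) = @_;
    return if $depth == 0;

    my $work = Tracker->new;        # imagine a large array built here
    # ... use $work for this level ...

    undef $work if $free_early;     # drop it before descending
    walk($depth - 1, $free_early);
    return;                         # otherwise $work is only freed here
}

walk(50, 0);
my $peak_held = $Tracker::peak;     # every frame still holds its data

($Tracker::live, $Tracker::peak) = (0, 0);
walk(50, 1);
my $peak_freed = $Tracker::peak;    # only one frame's data alive at a time

print "peak live objects, held across recursion: $peak_held\n";
print "peak live objects, freed before recursing: $peak_freed\n";
```

With 50 levels the first run peaks at 50 live objects and the second at 1, which is the difference between memory that grows with recursion depth and memory that stays flat.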

Perl also does some optimistic caching as a
performance booster, which is generally a win but
sometimes it isn't. While perl cleans up the
contents of variables, it leaves the underlying
structure of a sub's arrays and hashes allocated
for reuse. (Though only once for each sub, so this
doesn't get nuts for recursive invocations of a
subroutine.) Not normally a problem, but if you've
got a 100M element array the bits add up.

Finally, make sure you're using a relatively
recent perl, one of the 5.8 versions. There were
some bugs relating to closures that got patched
up -- earlier versions had some reference count
issues there so closures and their contents
tended not to ever get cleaned up.

*Assuming* you're not actually leaking data with
circular structures and the like, or throwing
massive amounts of data into globals, there are a
few things you can do to keep your memory usage
in line.

1) Do *not* pass in large arrays or hashes as
parameters. Use references to them instead, to
avoid perl's parameter flattening.
2) Kill your data yourself when you're done with
it by undef()ing the variables. (Do *not* assign
empty lists or empty strings to them. That isn't
enough.) "undef @foo", for example, will
completely clean out the @foo array, and leave
you with a variable that only takes up 56 bytes
or so.
3) Try to keep the number of hash keys you use
relatively low. (Not normally an issue, but once
you start getting into millions of entries it
adds up.) Perl makes individual small allocations
for hash keys and it tends to fragment the free
list.
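
A rough sketch of points 1 and 2 in plain perl. The sub names and the sample array are invented for the example, and the sizes are kept small so it runs quickly.

```perl
use strict;
use warnings;

# Point 1, the costly way: the array is flattened into @_ (one stack
# entry per element) and then copied again into @copy.
sub sum_by_copy {
    my @copy = @_;
    my $sum  = 0;
    $sum += $_ for @copy;
    return $sum;
}

# Point 1, the cheap way: only a single reference is passed, and the
# elements are read in place without being copied.
sub sum_by_ref {
    my ($aref) = @_;
    my $sum = 0;
    $sum += $_ for @$aref;
    return $sum;
}

my @big = (1 .. 100_000);

my $copied_sum = sum_by_copy(@big);
my $ref_sum    = sum_by_ref(\@big);
print "same answer either way: $copied_sum / $ref_sum\n";

# Point 2: release the array explicitly once it is no longer needed.
# "@big = ()" would empty it too, but per the advice above that is
# not enough to give back the storage behind it.
undef @big;
print "elements left after undef: ", scalar(@big), "\n";
```

The same pattern applies to hashes: pass \%h and walk it through the reference, then "undef %h" when you're finished with it.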

It might be worth a code review to see if you're
doing things that are inefficient in general.
That tends to be an issue when working with large
data sets, since inefficiencies that don't matter
with 100 (or 100K) records become an issue when
you get into massive data sets.

You can also do some memory usage investigation
with Devel::Size and some of the other Devel
modules. (Though be warned that Devel::Size is
pretty profligate itself with memory.)
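
For instance, a quick before-and-after measurement might look something like the sketch below. It assumes Devel::Size has been installed from CPAN (it is not part of the core distribution), so the example loads it lazily and skips the measurements if it's missing. The @rows data is made up.

```perl
use strict;
use warnings;

# Devel::Size may not be installed, so don't fail hard if it is absent.
my $have_devel_size = eval { require Devel::Size; 1 };

# Hypothetical working set: 10,000 small hashes held in one array.
my @rows = map { +{ id => $_, label => "row $_" } } 1 .. 10_000;

if ($have_devel_size) {
    # size() measures the array itself; total_size() follows the
    # references and includes everything the array points to.
    printf "array only:       %d bytes\n", Devel::Size::size(\@rows);
    printf "array + contents: %d bytes\n", Devel::Size::total_size(\@rows);

    undef @rows;
    printf "after undef:      %d bytes\n", Devel::Size::total_size(\@rows);
}
else {
    print "Devel::Size not installed; skipping measurements\n";
}
```

Measure on a small sample rather than on the full data set, since the measurement itself needs a fair amount of memory.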

>-----Original Message-----
>From: pgsql-general-owner(at)postgresql(dot)org
>[mailto:pgsql-general-owner(at)postgresql(dot)org] On
>behalf of Sean Davis
>Sent: Wednesday, March 30, 2005 17:01
>To: FERREIRA William (COFRAMI)
>Cc: Postgresql-General list
>Subject: Re: [GENERAL] plperl doesn't release memory
>
>
>As I understand it, a single execution of a pl/perl function will not
>be affected by the perl memory issue, so I don't think that is your
>problem.
>
>My guess is that you are reading a large query into perl, so the whole
>thing will be kept in memory (and you can't use more memory than you
>have). For a large query, this can be a huge amount of memory indeed.
>You could use another language like plpgsql that can support
>cursors/looping over query results or, in plperl you could use DBI (not
>spi_exec_query) and loop over query results.
>
>Hope this helps,
>Sean
>
>On Mar 30, 2005, at 9:33 AM, FERREIRA William (COFRAMI) wrote:
>
>> I have a similar problem.
>> I'm running PostgreSQL on a Pentium IV with 1 GB of RAM and Windows 2000 NT.
>> I have a large database and a big processing job taking more than 4 hours.
>> During the first hour PostgreSQL uses as much memory as virtual memory,
>> and I find this strange (growing to more than 800 MB).
>>
>> During the execution I get:
>> out of memory
>> Failed on request of size 56
>> and at the end, PostgreSQL uses 300 MB of memory and more than 2 GB of
>> virtual memory.
>>
>> Can this problem be resolved by tuning PostgreSQL settings?
>> Here are my parameters:
>> shared_buffers = 1000
>> work_mem = 131072
>> maintenance_work_mem = 131072
>> max_stack_depth = 4096
>> I tried work_mem at 512MB and 2MB and I get the same error...
>>
>> I read all the posts, but I don't know how I can configure perl on
>> Windows...
>>
>> thanks in advance
>>
>> Will
>>
>> -----Original Message-----
>> From: pgsql-general-owner(at)postgresql(dot)org
>> [mailto:pgsql-general-owner(at)postgresql(dot)org] On
>> behalf of Dan Sugalski
>> Sent: Friday, March 25, 2005 19:34
>> To: Greg Stark; pgsql-general(at)postgresql(dot)org
>> Subject: Re: [GENERAL] plperl doesn't release memory
>>
>>
>>
>> At 6:58 PM -0500 3/24/05, Greg Stark wrote:
>> >Dan Sugalski <dan(at)sidhe(dot)org> writes:
>> >
>> >> Anyway, if perl's using its own memory allocator you'll want to
>> >> rebuild it to not do that.
>> >
>> >You would need to do that if you wanted to use a debugging malloc.
>> >But there's no particular reason to think that you should need to do
>> >this just to work properly.
>> >
>> >Two mallocs can work fine alongside each other. They each call mmap
>> >or sbrk to allocate new pages and they each manage the pages they've
>> >received. They won't have any idea why the allocator seems to be
>> >skipping pages, but they should be careful not to touch those pages.
>>
>> Perl will only use a single allocator, so there's not a huge issue
>> there. It's either the external allocator or the internal one, which
>> is for the best since you certainly don't want to be handing back
>> memory to the wrong allocator. That way lies madness and unpleasant
>> core files.
>>
>> The bigger issue is that perl's memory allocation system, the one you
>> get if you build perl with usemymalloc set to yes, never releases
>> memory back to the system -- once the internal allocator gets a chunk
>> of memory from the system it's held for the duration of the process.
>> This is the right answer in many circumstances, and the allocator's
>> pretty nicely tuned to perl's normal allocation patterns, it's just
>> not really the right thing in a persistent server situation where
>> memory usage bounces up and down. It can happen with the system
>> allocator too, though it's less likely.
>>
>> One of those engineering tradeoff things, and not much to be done
>> about it really.
>> --
>> Dan
>>
>> --------------------------------------it's like this-------------------
>> Dan Sugalski even samurai
>> dan(at)sidhe(dot)org have teddy bears and even
>> teddy bears get drunk
>>
>> ---------------------------(end of
>> broadcast)---------------------------
>> TIP 8: explain analyze is your friend
>>
>> This mail has originated outside your organization,
>> either from an external partner or the Global Internet.
>> Keep this in mind if you answer this message.
>
>

--
Dan

--------------------------------------it's like this-------------------
Dan Sugalski even samurai
dan(at)sidhe(dot)org have teddy bears and even
teddy bears get drunk
