pl/python long-lived allocations in datum->dict transformation

From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: pl/python long-lived allocations in datum->dict transformation
Date: 2012-02-05 18:54:11
Message-ID: 4F2ED053.1010904@wulczer.org
Lists: pgsql-hackers

Consider this:

create table arrays as select array[random(), random(), random(),
random(), random(), random()] as a from generate_series(1, 1000000);

create or replace function plpython_outputfunc() returns void as $$
c = plpy.cursor('select a from arrays')
for row in c:
    pass
$$ language plpythonu;

When running the function, every datum will get transformed into a
Python dict, which includes calling the type's output function,
resulting in a memory allocation. The memory is allocated in the SPI
context, so it accumulates until the function is finished.

This is annoying for functions that plough through large tables, doing
some calculation. Attached is a patch that does the conversion of
PostgreSQL Datums into Python dict objects in a scratch memory context
that gets reset after every conversion.
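
To illustrate the difference, here is a toy model in plain Python. It is
not PostgreSQL code and not the patch itself: MemoryContext,
output_function and convert_rows are made-up names, and the byte
accounting is only a rough stand-in for what palloc would do. It compares
peak usage when the output function's result stays in one long-lived
context against a scratch context that is reset after each row.

```python
class MemoryContext:
    """Toy stand-in for a memory context: counts bytes held until reset."""

    def __init__(self, name):
        self.name = name
        self.allocated = 0
        self.peak = 0

    def alloc(self, nbytes):
        self.allocated += nbytes
        self.peak = max(self.peak, self.allocated)

    def reset(self):
        # Release everything allocated in this context at once.
        self.allocated = 0


def output_function(datum, context):
    """Mimic a type output function: build a text form, charge it to context."""
    text = "{" + ",".join(repr(x) for x in datum) + "}"
    context.alloc(len(text) + 1)  # +1 models the C string terminator
    return text


def convert_rows(rows, context, reset_per_row):
    """Convert each row to a dict and return the context's peak usage."""
    for datum in rows:
        row = {"a": output_function(datum, context)}  # dict is dropped per row
        if reset_per_row:
            context.reset()
    return context.peak


rows = [[1.5, 2.5, 3.5]] * 1000

# Long-lived context: every row's allocation stays until the function ends.
leaky_peak = convert_rows(rows, MemoryContext("long-lived"), reset_per_row=False)

# Scratch context: reset after each row, so the peak is one row's worth.
scratch_peak = convert_rows(rows, MemoryContext("scratch"), reset_per_row=True)
```

In this model the long-lived context's peak grows linearly with the
number of rows, while the scratch context's peak stays at the size of a
single row's output.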

Cheers,
Jan

Attachment: plpython-tuple-to-dict-leak.patch (text/x-diff, 4.2 KB)
