I am currently experimenting using plpythonu with postgresql 8.0 for Win32. It's basically a quick script that imports data from CSV files, but does some quite complicated data lookups and selections. The area in which I'm somewhat confunded is memory usage. The process successfully runs, but keeps climbing in memory usage relentlessly, successfully importing about 200k records, before the memory usage of postgres soars to 2gb and of course, shortly after that, grinds to a halt with a "Out of Memory" error.
I'm not deliberately storing anything in the SD/GD dictionaries, and am not dealing with triggers...
create function blah(text) returns int8 as
# Initialisation of plans
myplan = plpy.prepare('insert into tablename (var1,var2) values($1,$2)', ['text','text'] )
# Setup external CSV data source
# For each record, that meets certain critera, execute insert...
for rec in dsource: plpy.execute(myplan, [Value1, Value2] )
# Finishing stuff
Given I'm importing about 250 million records and only want to end up with about 4 million, is
1) This possible using the above?
2) Better suited to something else (I've looked at COPY but that would require the entire table be uploaded first, then filtered and I'd like to avoid that if necessary, or thinking about it, I spose a trigger could be written that responded on the copy?). Also the other thing is COPY is only applicable to simple text files, while I want this import script to be generic from whatever datasource it may be importing from (an ODBC/DBF/Berkeley DB format etc...)
Anyhow, thanks in advance for any help.
Any Qs, please gimme a yell.
This e-mail has been scanned for all viruses by Star. The
service is powered by MessageLabs. For more information on a proactive
anti-virus service working around the clock, around the globe, visit:
pgsql-interfaces by date
|Next:||From: Tom Lane||Date: 2005-11-09 16:52:35|
|Subject: Re: PLPythonU & Out of Memory - Importing Query |
|Previous:||From: Greg Sabino Mullane||Date: 2005-11-07 13:01:15|
|Subject: Re: DBD::Pg returns 1/0 for boolean field ...|