Re: [HACKERS] Out of memory problem (forwarded bug report)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Vladimír Beneš <Vladimir(dot)Benes(at)pvt(dot)cz>
Cc: pgsql-hackers(at)postgreSQL(dot)org, Mühlpachr Michal <michalm(at)pvt(dot)net>
Subject: Re: [HACKERS] Out of memory problem (forwarded bug report)
Date: 2000-02-24 05:45:56
Message-ID: 14205.951371156@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Vladimir,
Thanks for the details. I think you are undoubtedly running into
expression evaluation memory leaks. Basically, any expression that
yields a non-pass-by-value data type consumes memory that is not
reclaimed until end of statement --- so when you process a few million
rows, that memory starts to add up. (Yes, I realize this is a horrible
misfeature. It's on our TO-DO list to fix it, but it probably won't
happen until 7.1 or 7.2.) In the meantime the best I can offer you
is workarounds.

I think the major problems here are coming from the
"date_trunc('day',start)" calculation (because its datetime result is
pass-by-reference) and to a lesser extent from the sum(bytes)
calculation (because int8 is pass-by-reference). You could easily
replace "date_trunc('day',start)" with "date(start)"; since date is
a pass-by-value type, that won't leak memory, and it should give
equivalent results. The int8 sum is not quite so easy to fix.
I assume you can't get away with switching to int4 --- probably
your sum would overflow an int4? It may be that just fixing the
inefficient date_trunc calc will reduce your memory requirements
enough to get by. If not, the only good news I have is that release
7.0 does fix the memory-leak problem for internal calculations of
aggregate functions like sum(). You can get the first beta release
for 7.0 now.

regards, tom lane

"=?iso-8859-2?B?VmxhZGlt7XIgQmVuZbk=?=" <Vladimir(dot)Benes(at)pvt(dot)cz> writes:
> -----Pvodn zprva-----
> Od: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
> Komu: Oliver Elphick <olly(at)lfix(dot)co(dot)uk>
> Kopie: pgsql-hackers(at)postgreSQL(dot)org <pgsql-hackers(at)postgreSQL(dot)org>;
> Vladimir(dot)Benes(at)pvt(dot)cz <Vladimir(dot)Benes(at)pvt(dot)cz>
> Datum: 22. nora 2000 18:06
> Pedmt: Re: [HACKERS] Out of memory problem (forwarded bug report)

>> "Oliver Elphick" <olly(at)lfix(dot)co(dot)uk> writes:
>>> Can someone advise, please, how to deal with this problem in 6.5.3?
>>

>> My guess is that the cause is memory leaks during expression evaluation;
>> but without seeing the complete view definitions and underlying table
>> definitions, it's impossible to know what processing is being invoked
>> by this query...
>>
>> regards, tom lane

> Well, I will append views and underlying table definition:

> 1) Once again - failure query:
> select comm_type,name,tot_bytes,tot_packets
> from flow_sums_days_send_200002_view
> where day='2000-02-21' and name not like '@%'
> union all
> select comm_type,name,tot_bytes,tot_packets
> from flow_sums_days_receive_200002_view
> where day='2000-02-21' and name not like '@%'

> 2) views definition:
> create view flow_sums_days_send_200002_view as
> select
> 'send'::varchar as comm_type, date_trunc('day',start) as day,
> src_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets
> from flow_sums_200002
> group by day, src_name

> create view flow_sums_days_receive_200002_view as
> select
> 'receive'::varchar as comm_type, date_trunc('day',start) as day,
> dst_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets
> from flow_sums_200002
> group by day, dst_name

> I wanted create only one usefull view:

> create view flow_sums_days_200002_view as
> select
> 'send'::varchar as comm_type, date_trunc('day',start) as day,
> src_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets
> from flow_sums_200002
> group by day, src_name
> UNION ALL
> select
> 'receive'::varchar as comm_type, date_trunc('day',start) as day,
> dst_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets
> from flow_sums_200002
> group by day, dst_name

> ...but Postgres cann't use clause UNION ALL at view definition. So I created
> two views mentioned above and I wanted use this ones with UNION ALL clause
> only.

> 3) underlaying table definition:
> create table flow_sums_200002 (
> primary_collector varchar(50) not null,
> start datetime not null,
> end_period datetime not null,
> dead_time_rel float4 not null,
> src_name varchar(50) not null,
> dst_name varchar(50) not null,
> bytes int8 not null,
> packets int4 not null
> )

> Today this table has about 3 000 000 rows and the select command
> mentioned above returns 190 + 255 rows.

> Now I don't use clause "UNION ALL" and the program executes two queryes
> and then adds both result to new result. I reduced time increment of number
> rows to flow_sums_200002 table (three times less). This table contains data
> of February 2000 and the program will create table flow_sums_200003 with
> relevant views next month.
> Well, now this solution solve my problem but always depends on number of
> rows - I only moved limit of rows count.

> Thank You, V. Benes

> P.S.: I append part of top on my system while the query is running:

> CPU states: 98.6% user, 1.3% system, 0.0% nice, 0.0% idle
> Mem: 127256K av, 124316K used, 2940K free, 29812K shrd, 2620K buff
> Swap: 128516K av, 51036K used, 77480K free 7560K cached

> PID USER PRI NI SIZE RSS SHARE STAT LIB %CPU %MEM TIME COMMAND
> 2942 postgres 20 0 141M 99M 17348 R 0 99.0 80.4 1:22 postmaster

> => postmaster later took 80 - 95% of memory, free memory decressed to 2 MB,
> CPU was overloaded (0% idle and 99% by user process of postmaster). Have You
> ever seen something similar :-) ?

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeroen van Vianen 2000-02-24 08:35:19 Re: [HACKERS] Re: [PATCHES] Patch for more readable parse error messages
Previous Message Don Baccus 2000-02-24 02:33:36 Re: [HACKERS] Re: about 7.0 LIMIT optimization