Re: [HACKERS] Out of memory problem (forwarded bug report)

From: Vladimír Beneš <Vladimir(dot)Benes(at)pvt(dot)cz>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-hackers(at)postgreSQL(dot)org>, Mühlpachr Michal <michalm(at)pvt(dot)net>
Subject: Re: [HACKERS] Out of memory problem (forwarded bug report)
Date: 2000-02-24 12:06:08
Message-ID: 010101bf7ebf$888368f0$451c11ac@p53apk.chv.pvt.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

thank You very much, You helped me :-)

Well, I will use function date() and not date_trunc() in my views.

I tryed change type of "bytes" column from int8 to int4. Range 0 .. +2
147 483 647 will suffice here. But I must retype this type to int8 in sum
function because overflow can occur:

collector=> create table flow_sums (
collector-> primary_collector varchar(50) not null,
collector-> start datetime not null,
collector-> end_period datetime not null,
collector-> dead_time_rel float4 not null,
collector-> src_name varchar(50) not null,
collector-> dst_name varchar(50) not null,
collector-> bytes int4 not null,
collector-> packets int4 not null);
CREATE
collector=> insert into
flow_sums(primary_collector,start,end_period,dead_time_rel,src_name,dst_name
,bytes,packets)
collector-> select
primary_collector,start,end_period,dead_time_rel,src_name,dst_name,bytes,pac
kets
collector-> from flow_sums_200002;
INSERT 0 3198588
collector=> create unique index flow_sums_pk on
flow_sums(primary_collector,start, src_name, dst_name);
CREATE
collector=>
collector=> select sum(bytes) from flow_sums;
sum
-----------
-1712976144
(1 row)
collector=> select sum(int8(bytes)) from flow_sums;
sum
------------
603877412592
(1 row)

collector=>

So my views will retype attributes in sums to int8 and they will use
function date to round datetime attribute.

Thank You very much and please forward this mail to
pgsql-hackers(at)postgreSQL(dot)org becouse the robot will bounce my CC.

Thanks, V. Benes

-----Původní zpráva-----
Od: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Komu: Vladimír Beneš <Vladimir(dot)Benes(at)pvt(dot)cz>
Kopie: pgsql-hackers(at)postgreSQL(dot)org <pgsql-hackers(at)postgreSQL(dot)org>;
Mühlpachr Michal <michalm(at)pvt(dot)net>
Datum: 24. února 2000 6:46
Předmět: Re: [HACKERS] Out of memory problem (forwarded bug report)

>Vladimir,
> Thanks for the details. I think you are undoubtedly running into
>expression evaluation memory leaks. Basically, any expression that
>yields a non-pass-by-value data type consumes memory that is not
>reclaimed until end of statement --- so when you process a few million
>rows, that memory starts to add up. (Yes, I realize this is a horrible
>misfeature. It's on our TO-DO list to fix it, but it probably won't
>happen until 7.1 or 7.2.) In the meantime the best I can offer you
>is workarounds.
>
> I think the major problems here are coming from the
>"date_trunc('day',start)" calculation (because its datetime result is
>pass-by-reference) and to a lesser extent from the sum(bytes)
>calculation (because int8 is pass-by-reference). You could easily
>replace "date_trunc('day',start)" with "date(start)"; since date is
>a pass-by-value type, that won't leak memory, and it should give
>equivalent results. The int8 sum is not quite so easy to fix.
>I assume you can't get away with switching to int4 --- probably
>your sum would overflow an int4? It may be that just fixing the
>inefficient date_trunc calc will reduce your memory requirements
>enough to get by. If not, the only good news I have is that release
>7.0 does fix the memory-leak problem for internal calculations of
>aggregate functions like sum(). You can get the first beta release
>for 7.0 now.
>
> regards, tom lane
>
>
>"=?iso-8859-2?B?VmxhZGlt7XIgQmVuZbk=?=" <Vladimir(dot)Benes(at)pvt(dot)cz> writes:
>> -----Původní zpráva-----
>> Od: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
>> Komu: Oliver Elphick <olly(at)lfix(dot)co(dot)uk>
>> Kopie: pgsql-hackers(at)postgreSQL(dot)org <pgsql-hackers(at)postgreSQL(dot)org>;
>> Vladimir(dot)Benes(at)pvt(dot)cz <Vladimir(dot)Benes(at)pvt(dot)cz>
>> Datum: 22. února 2000 18:06
>> Předmět: Re: [HACKERS] Out of memory problem (forwarded bug report)
>
>
>>> "Oliver Elphick" <olly(at)lfix(dot)co(dot)uk> writes:
>>>> Can someone advise, please, how to deal with this problem in 6.5.3?
>>>
>
>>> My guess is that the cause is memory leaks during expression evaluation;
>>> but without seeing the complete view definitions and underlying table
>>> definitions, it's impossible to know what processing is being invoked
>>> by this query...
>>>
>>> regards, tom lane
>
>
>
>> Well, I will append views and underlying table definition:
>
>> 1) Once again - failure query:
>> select comm_type,name,tot_bytes,tot_packets
>> from flow_sums_days_send_200002_view
>> where day='2000-02-21' and name not like '@%'
>> union all
>> select comm_type,name,tot_bytes,tot_packets
>> from flow_sums_days_receive_200002_view
>> where day='2000-02-21' and name not like '@%'
>
>> 2) views definition:
>> create view flow_sums_days_send_200002_view as
>> select
>> 'send'::varchar as comm_type, date_trunc('day',start) as day,
>> src_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets
>> from flow_sums_200002
>> group by day, src_name
>
>> create view flow_sums_days_receive_200002_view as
>> select
>> 'receive'::varchar as comm_type, date_trunc('day',start) as day,
>> dst_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets
>> from flow_sums_200002
>> group by day, dst_name
>
>
>> I wanted create only one usefull view:
>
>> create view flow_sums_days_200002_view as
>> select
>> 'send'::varchar as comm_type, date_trunc('day',start) as day,
>> src_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets
>> from flow_sums_200002
>> group by day, src_name
>> UNION ALL
>> select
>> 'receive'::varchar as comm_type, date_trunc('day',start) as day,
>> dst_name as name, sum(bytes) as tot_bytes, sum(packets) as tot_packets
>> from flow_sums_200002
>> group by day, dst_name
>
>> ...but Postgres cann't use clause UNION ALL at view definition. So I
created
>> two views mentioned above and I wanted use this ones with UNION ALL
clause
>> only.
>
>> 3) underlaying table definition:
>> create table flow_sums_200002 (
>> primary_collector varchar(50) not null,
>> start datetime not null,
>> end_period datetime not null,
>> dead_time_rel float4 not null,
>> src_name varchar(50) not null,
>> dst_name varchar(50) not null,
>> bytes int8 not null,
>> packets int4 not null
>> )
>
>> Today this table has about 3 000 000 rows and the select command
>> mentioned above returns 190 + 255 rows.
>
>
>> Now I don't use clause "UNION ALL" and the program executes two
queryes
>> and then adds both result to new result. I reduced time increment of
number
>> rows to flow_sums_200002 table (three times less). This table contains
data
>> of February 2000 and the program will create table flow_sums_200003 with
>> relevant views next month.
>> Well, now this solution solve my problem but always depends on number
of
>> rows - I only moved limit of rows count.
>
>
>> Thank You, V. Benes
>
>> P.S.: I append part of top on my system while the query is running:
>
>> CPU states: 98.6% user, 1.3% system, 0.0% nice, 0.0% idle
>> Mem: 127256K av, 124316K used, 2940K free, 29812K shrd, 2620K buff
>> Swap: 128516K av, 51036K used, 77480K free 7560K
cached
>
>> PID USER PRI NI SIZE RSS SHARE STAT LIB %CPU %MEM TIME
COMMAND
>> 2942 postgres 20 0 141M 99M 17348 R 0 99.0 80.4 1:22
postmaster
>
>> => postmaster later took 80 - 95% of memory, free memory decressed to 2
MB,
>> CPU was overloaded (0% idle and 99% by user process of postmaster). Have
You
>> ever seen something similar :-) ?

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2000-02-24 12:47:36 Re: [BUGS] First experiences with Postgresql 7.0
Previous Message Oliver Elphick 2000-02-24 11:22:38 Re: [HACKERS] Out of memory problem (forwarded bug report)