summarizing traffic logs

From: Tamas MEZEI <tamas(at)bazmag(dot)hu>
To: pgsql-novice(at)postgresql(dot)org
Subject: summarizing traffic logs
Date: 2004-10-11 00:06:51
Message-ID: 4169CE9B.9010403@bazmag.hu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Hi,

I've just started to use PgSQL, and having some trouble with aggregation
in SQL.

I have a huge-and-growing table described as:

current_traffic (
from_timestamp timestamp without timezone,
to_timestamp timestamp without timezone,
source_host inet,
dest_host inet,
flow_size bigint,
flow_kind smallint,
flow_proto smallint
);

The input is coming from a named pipe, and inserted into the table by a
perl script. I hope the field names are speaking from themselves, but
some additional info:
- source_host and dest_host: we have a campus network in the subnet
x.y.z.0/21 and an university network in the range of x.y.0.0/16, and
this will be inportant on summarizing
- flow_kind is a smallint value with discrete numbers (ie. it can be
one of {1,2,3,4}) and contains the meaning "this flow is web traffic",
"this is mail" etc.
- flow proto can be 6 or 17 (tcp and udp).

Input is coming like crazy (~60000 rows in 15 mins, we have ~2000 hosts)
and I'd like to do some aggregation in every 15 mins to a table like
below, and then truncate the current_traffic log table. (Is it ok, and
is truncating "atomically"?)

quarterly_sum (
from_timestamp timestamp without timezone,
to_timestamp timestamp without timezone,
host inet,
tcp_to_uni_kind1 bigint,
tcp_to_uni_kind2 bigint,
tcp_to_uni_kind3 bigint,
tcp_to_uni_kind4 bigint,
tcp_from_uni_kind1 bigint,
tcp_from_uni_kind2 bigint,
tcp_from_uni_kind3 bigint,
tcp_from_uni_kind4 bigint,
tcp_to_world_kind1 bigint,
tcp_to_world_kind2 bigint,
tcp_to_world_kind3 bigint,
tcp_to_world_kind4 bigint,
tcp_from_world_kind1 bigint,
tcp_from_world_kind2 bigint,
tcp_from_world_kind3 bigint,
tcp_from_world_kind4 bigint,
udp_to_uni_kind1 bigint,
udp_to_uni_kind2 bigint,
udp_to_uni_kind3 bigint,
udp_to_uni_kind4 bigint,
udp_from_uni_kind1 bigint,
udp_from_uni_kind2 bigint,
udp_from_uni_kind3 bigint,
udp_from_uni_kind4 bigint,
udp_to_world_kind1 bigint,
udp_to_world_kind2 bigint,
udp_to_world_kind3 bigint,
udp_to_world_kind4 bigint,
udp_from_world_kind1 bigint,
udp_from_world_kind2 bigint,
udp_from_world_kind3 bigint,
udp_from_world_kind4 bigint
);

Classifying "world" and "university" traffic is quite easy with PgSQL
inet functions, but how should i create the aggregations grouped by the
kind of the flow?

I had some thoughts about creating some views, or using
triggers/cursors, but I'm not that deep in PgSQL to fully understand
every bit.

If anybody would help me solving this problem, that would be highly
appreciated.

Thanks,

Tamas

Browse pgsql-novice by date

  From Date Subject
Next Message Number One 2004-10-11 08:10:02 Q: parameters to functions
Previous Message V i s h a l Kashyap @ [Sai Hertz And Control Systems] 2004-10-09 16:15:31 View or Function as default field