Re: Interesting question

From: Larry Rosenman <ler(at)lerctr(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers List <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Interesting question
Date: 2001-05-19 03:27:50
Message-ID: 20010518222750.A21941@lerami.lerctr.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

* Larry Rosenman <ler(at)lerctr(dot)org> [010518 21:48]:
> * Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> [010518 21:09]:
> > Larry Rosenman <ler(at)lerctr(dot)org> writes:
> > > EXPLAIN
> > > SELECT ...
> > > FROM attack_db
> > > WHERE (start_time >= now() - '02:00:00'::interval OR
> > > end_time >= now() - '02:00:00'::interval)
> > > AND host(src_ip) NOT IN (select host(ip) from exempt_ips)
> > > AND host(dst_ip) NOT IN (select host(ip) from exempt_ips)
> > > ORDER BY bytes DESC;
> >
> > > NOTICE: QUERY PLAN:
> >
> > > Sort (cost=10870.77..10870.77 rows=5259 width=120)
> > > -> Seq Scan on attack_db (cost=0.00..10358.95 rows=5259 width=120)
> > > SubPlan
> > > -> Seq Scan on exempt_ips (cost=0.00..1.04 rows=4 width=12)
> > > -> Seq Scan on exempt_ips (cost=0.00..1.04 rows=4 width=12)
> >
> >
> > Making use of the indexes on start_time and end_time would be a good
> > thing. The reason it's not doing that is it doesn't think that the
> > expressions "now() - '02:00:00'::interval" reduce to constants. We
> > may have a proper solution for that by the time 7.2 comes out, but
> > in the meantime you could fake it with a function that hides the
> > noncacheable function and operator --- see previous discussions of
> > this identical issue in the archives.
> OK. What would you suggest for the function? I'd like the
> '02:00:00'::interval to be a variable somehow to change the
> interval we're searching. What fills the table is a daemon that is
> looking at the netflow data, and when a packet that matches one of the
> attack profiles comes along, it does an insert into attack_db.
I tried the following function:

--
-- TOC Entry ID 15 (OID 35180)
--
-- Name: "nowminus" (interval) Type: FUNCTION Owner: ler
--

CREATE FUNCTION "nowminus" (interval) RETURNS timestamp with time zone AS 'SELECT now() - $1;' LANGUAGE 'sql';

and the following query:

EXPLAIN
SELECT to_char(start_time,'MM/DD/YY') as mmddyy,
to_char(start_time,'HH24:MI:SS') as hhmmss,
getattack_type(attack_type) as type,
src_router as router,
input_int as ii,
output_int as oi,
src_as as srcas,host(src_ip) || '/' || masklen(src_ip) || ':' ||
format_port(protocol,src_port) as src_address,
dst_as as dstas,host(dst_ip) || '/' || masklen(dst_ip) || ':' ||
format_port(protocol,dst_port) as dst_address,
format_protocol(protocol) as prot,
tos,format_flags(protocol,pr_flags) as tcpflags,
pkts,bytes,
bytes/pkts as bytes_per_packet,
to_char(end_time,'MM/DD/YY') as end_mmddyy,
to_char(end_time,'HH24:MI:SS') as end_hhmmss,
next_hop
FROM attack_db
WHERE (start_time >= nowminus('02:00:00'::interval) OR
end_time >= nowminus('02:00:00'::interval) )
AND host(src_ip) NOT IN (select host(ip) from exempt_ips)
AND host(dst_ip) NOT IN (select host(ip) from exempt_ips)
ORDER BY bytes DESC; ;

And got the following plan:

NOTICE: QUERY PLAN:

Sort (cost=11313.95..11313.95 rows=5497 width=120)
-> Seq Scan on attack_db (cost=0.00..10777.58 rows=5497 width=120)
SubPlan
-> Seq Scan on exempt_ips (cost=0.00..1.04 rows=4 width=12)
-> Seq Scan on exempt_ips (cost=0.00..1.04 rows=4 width=12)

EXPLAIN
>
>
> >
> > The NOT INs are pretty ugly too (and do you need the host() conversion
> > there? Seems like a waste of cycles...). You might be able to live
> > with that if the timestamp condition will always be pretty restrictive,
> > but otherwise they'll be a no go. Consider NOT EXISTS with an index
> > on exempt_ips(ip).
> Yes, because the masks will probably be different each time (this is
> from netflow data from my cisco's). The exempt IP's table is, at the
> moment 4 ip's, so that's quick anyway.
>
> >
> > regards, tom lane
>
> --
> Larry Rosenman http://www.lerctr.org/~ler
> Phone: +1 972-414-9812 E-Mail: ler(at)lerctr(dot)org
> US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo(at)postgresql(dot)org
>

--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: ler(at)lerctr(dot)org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2001-05-19 03:29:55 Re: Plans for solving the VACUUM problem
Previous Message Bruce Momjian 2001-05-19 03:17:47 Fix for tablename in targetlist