Re: Questions about indexes?

From: Curt Sampson <cjs(at)cynic(dot)net>
To: Ryan Bradetich <rbradetich(at)uswest(dot)net>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Questions about indexes?
Date: 2003-02-20 11:30:50
Message-ID: Pine.NEB.4.51.0302202020210.474@angelic-vtfw.cvpn.cynic.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

On Wed, 19 Feb 2003, Ryan Bradetich wrote:

> 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user x has an invalid shell.
> 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has an invalid shell.
> 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has expired password.
> 2 | Mon Feb 17 00:34:24 MST 2003 | f101 | file /foo has improper owner.

If you're going to normalize this a bit, you should start looking at
the data that are repeated and trying to get rid of the repititions.
First of all, the timestamp is repeated a lot, you might move that to a
separate table and just use a key into that table. But you might even
do better with multiple columns: combine the timestamp and host ID into
one table to get a "host report instance" and replace both those columns
with just that. If host-id/timestamp/category triplets are frequently
repeated, you might even consider combining the three into another
table, and just using an ID from that table with each anomaly.

But the biggest space and time savings would come from normalizing your
anomalys themselves, because there's a huge amount repeated there. If you're
able to change the format to something like:

invalid shell for user: x
invalid shell for user: y
expired password for user: y
improper owner for file: /foo

You can split those error messages off into another table:

anomaly_id | anomaly
-----------+------------------------------------------------
1 | invalid shell for user
2 | expired password for user
3 | improper owner for file

And now your main table looks like this:

host_id | timestamp | ctgr | anomaly_id | datum
--------+------------------------------+------+------------+------
1 | Mon Feb 17 00:34:24 MST 2003 | p101 | 1 | x
1 | Mon Feb 17 00:34:24 MST 2003 | p101 | 1 | y
1 | Mon Feb 17 00:34:24 MST 2003 | p101 | 2 | y
2 | Mon Feb 17 00:34:24 MST 2003 | f101 | 3 | /foo

cjs
--
Curt Sampson <cjs(at)cynic(dot)net> +81 90 7737 2974 http://www.netbsd.org
Don't you know, in this new Dark Age, we're all light. --XTC

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Uros 2003-02-20 11:57:43 Re: [OpenFTS-general] Alpha-2 of contrib/tsearch
Previous Message Teodor Sigaev 2003-02-20 10:37:34 Alpha-2 of contrib/tsearch

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2003-02-20 22:33:02 Re: Tuning scenarios (was Changing the default configuration)
Previous Message Mike Nielsen 2003-02-20 10:05:32 Re: Peluang Usaha yang Luar Biasa