Skip site navigation (1) Skip section navigation (2)

Re: Questions about indexes?

From: Curt Sampson <cjs(at)cynic(dot)net>
To: Ryan Bradetich <rbradetich(at)uswest(dot)net>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Questions about indexes?
Date: 2003-02-20 11:30:50
Message-ID: Pine.NEB.4.51.0302202020210.474@angelic-vtfw.cvpn.cynic.net (view raw or flat)
Thread:
Lists: pgsql-hackerspgsql-performance
On Wed, 19 Feb 2003, Ryan Bradetich wrote:

> 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user x has an invalid shell.
> 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has an invalid shell.
> 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has expired password.
> 2 | Mon Feb 17 00:34:24 MST 2003 | f101 | file /foo has improper owner.

If you're going to normalize this a bit, you should start looking at
the data that are repeated and trying to get rid of the repititions.
First of all, the timestamp is repeated a lot, you might move that to a
separate table and just use a key into that table. But you might even
do better with multiple columns: combine the timestamp and host ID into
one table to get a "host report instance" and replace both those columns
with just that. If host-id/timestamp/category triplets are frequently
repeated, you might even consider combining the three into another
table, and just using an ID from that table with each anomaly.

But the biggest space and time savings would come from normalizing your
anomalys themselves, because there's a huge amount repeated there. If you're
able to change the format to something like:

    invalid shell for user: x
    invalid shell for user: y
    expired password for user: y
    improper owner for file: /foo

You can split those error messages off into another table:

    anomaly_id | anomaly
    -----------+------------------------------------------------
             1 | invalid shell for user
	     2 | expired password for user
	     3 | improper owner for file

And now your main table looks like this:

    host_id | timestamp                    | ctgr | anomaly_id | datum
    --------+------------------------------+------+------------+------
          1 | Mon Feb 17 00:34:24 MST 2003 | p101 |          1 | x
          1 | Mon Feb 17 00:34:24 MST 2003 | p101 |          1 | y
          1 | Mon Feb 17 00:34:24 MST 2003 | p101 |          2 | y
          2 | Mon Feb 17 00:34:24 MST 2003 | f101 |          3 | /foo

cjs
-- 
Curt Sampson  <cjs(at)cynic(dot)net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC

In response to

pgsql-performance by date

Next:From: Josh BerkusDate: 2003-02-20 22:33:02
Subject: Re: Tuning scenarios (was Changing the default configuration)
Previous:From: Mike NielsenDate: 2003-02-20 10:05:32
Subject: Re: Peluang Usaha yang Luar Biasa

pgsql-hackers by date

Next:From: UrosDate: 2003-02-20 11:57:43
Subject: Re: [OpenFTS-general] Alpha-2 of contrib/tsearch
Previous:From: Teodor SigaevDate: 2003-02-20 10:37:34
Subject: Alpha-2 of contrib/tsearch

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group