Re: Questions about indexes?

From: Ryan Bradetich <rbradetich(at)uswest(dot)net>
To: Josh Berkus <josh(at)agliodbs(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Questions about indexes?
Date: 2003-02-19 08:10:53
Message-ID: 1045642252.17975.116.camel@beavis.ybsoft.com
Lists: pgsql-hackers pgsql-performance

Josh,

Posting to the performance list as requested :) The reason I originally
posted to the hackers list was that I was curious about the contents of
the index and how it worked... but now this thread is more about
performance, so this list is more appropriate.

On Tue, 2003-02-18 at 10:37, Josh Berkus wrote:
> Ryan,
>
> I am crossing this discussion to the PGSQL-PERFORMANCE list, which is the
> proper place for it. Anyone interested, please follow us there!
>
> >>>Ryan Bradetich said:
> > the table would look like:
> > 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user x has an invalid shell.
> > 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has an invalid shell.
> > 1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has expired password.
> > 2 | Mon Feb 17 00:34:24 MST 2003 | f101 | file /foo has improper owner.
> > etc...
> >
> > So I do not need the anomaly to be part of the index, I only need it
> > to be unique.
> >
> > I agree with you that I would not normally add the anomaly to the
> > index, except for the unique row requirement. Thinking about it now,
> > maybe I should guarantee unique rows via a check constraint...
> >
> > Thanks for making me think about this in a different way!
>
> First off, I'm not clear on why a duplicate anomaly would necessarily be
> invalid, given the above. Not useful, certainly, but legitimate real data.

Duplicate anomalies are not invalid in general; they are only invalid if
they are for the same system and the same category, from the same
compliance report.

For example, this is bad:
1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user x has an invalid shell.
1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user x has an invalid shell.

This is ok:
1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user x has an invalid shell.
1 | Mon Feb 17 00:34:24 MST 2003 | p101 | user y has an invalid shell.
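
For reference, the constraint I have effectively been enforcing is a
single unique index over all four columns, something like this (the
column names other than anomaly are just placeholders for illustration):

    CREATE UNIQUE INDEX history_unq
        ON history (host_id, timeoccurred, category, anomaly);

It is the trailing anomaly column that makes this index so large.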

The only reason duplicate entries would occur is if the same compliance
report was entered into the database twice (e.g. the host did not
generate a new compliance report, or there is a bug in the data stuffing
script). Daniel Kalchev from the pgsql-hackers list had a good idea that
I am investigating: make the archive script responsible for preventing
duplicate entries in the database, which eliminates the need for an
index to do this.
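
A rough sketch of what the stuffer script could do per row, using the
same placeholder column names as above (untested, and it assumes the
stuffer script is the only writer, since without a unique index two
concurrent loaders could still race):

    INSERT INTO history (host_id, timeoccurred, category, anomaly)
    SELECT 1, '2003-02-17 00:34:24', 'p101', 'user x has an invalid shell.'
    WHERE NOT EXISTS (
        SELECT 1 FROM history
         WHERE host_id = 1
           AND timeoccurred = '2003-02-17 00:34:24'
           AND category = 'p101'
           AND anomaly = 'user x has an invalid shell.');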

The whole reason I decided not to allow duplicate entries into the
archive table is so that when I generate reports I do not have to put
the DISTINCT qualifier on the queries, which eliminates the sort and
speeds up the queries. The entire purpose of the index was to maintain
good data integrity in the archive tables for reporting purposes. If I
can enforce the data integrity another way (e.g. in the data stuffer
scripts, or with an index on an md5 hash of the data) then I can drop
this huge index and be happy :)
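
The md5 idea is roughly the following (just a sketch of the basic shape,
not the half-size variant I mention later; it assumes an md5() function
and expression indexes are available, which depends on the server
version and contrib modules):

    CREATE UNIQUE INDEX history_md5_unq
        ON history (md5(host_id::text || '|' || timeoccurred::text || '|' ||
                        category || '|' || anomaly));

The unique index then covers a fixed-width 32-character hash instead of
the full anomaly text.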

> I realize that you have performance considerations to work with. However, I
> must point out that your database design is not *at all* normalized ... and
> that normalizing it might solve some of your indexing problems.

Please do point out these design errors! I am always interested in
learning more about normalization, since I do not have any formal DBA
training and most of my knowledge comes from reading books, mailing
lists, and experimenting :)

> A normalized structure would look something like:
>
> TABLE operations
> id serial not null primary key,
> host_id int not null,
> timeoccurred timestamp not null default now(),
> category varchar(5) not null,
> constraint operations_unq unique (host_id, timeoccurred, category)
>
> TABLE anomalies
> id serial not null primary key,
> operation_id int not null references operations(id) on delete cascade,
> anomaly text
>
> This structure would have two advantages for you:
> 1) the operations table would be *much* smaller than what you have now, as you
> would not be repeating rows for each anomaly.

I agree the operations table would be smaller, but you have also added
some data by breaking it up into 2 tables. You have an oid (8 bytes) +
operations.id (4 bytes) + anomalies.id (4 bytes) + operation_id (4
bytes) + tuple overhead[2] (36 bytes).

The anomalies.id and operation_id will be duplicated for all ~85
million rows[1], but we did remove the host_id (4 bytes), timestamp
(8 bytes), and category (~5 bytes) ... for a net savings of 9 bytes/row.

My quick estimation shows this saves ~730 MB in the table and another
~730 MB in the index, for a combined total of ~1.46 GB. (The reason the
index savings on the anomalies table is not larger is explained further
in my response to point 2.)

So to gain any space savings from breaking the tables up, the total
size of the operations table + its primary key index + the
operations_unq index must be less than 1.46 GB.

The operations table contains oid (8 bytes) + host_id (4 bytes) +
timestamp (8 bytes) + category (~5 bytes) + tuple overhead[2] (36
bytes). Also, since every column is duplicated in either the primary key
index or the operations_unq index, the combined index size will be
approximately the size of the main table.

So 730 MB / 61 bytes == ~12.5 million rows.

A quick calculation of hosts * categories * data points shows that we
could have a maximum of ~12 million entries[3] in the operations table,
so this would save some space :)
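
Spelled out, the arithmetic behind those estimates looks like this
(back-of-the-envelope figures, not measurements):

    -- bytes saved in the anomalies table:
    SELECT 9 * 85221799;                    -- ~767,000,000 bytes (~730 MB)
    -- megabytes saved; the index saves roughly the same again (~1.46 GB total):
    SELECT 9 * 85221799 / (1024 * 1024);    -- ~731
    -- break-even row count for the operations table:
    SELECT 9 * 85221799 / 61;               -- ~12.5 million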

> 2) In neither table would you be including the anomaly text in an index ...
> thus reducing index size tremendously.

Unless I implement Daniel's method of verifying the uniqueness at the
data insertion point, I will still need to guarantee the anomaly is
unique for the given operation_id. If I misread the table schema, would
you please point it out to me ... I'm probably being dense :)

Also, I do not understand why the anomalies table needs the id column
as its primary key. Wouldn't operation_id plus the anomaly form the
primary key? We could save 8 bytes (4 in the table + 4 in the index) *
~85 million rows by removing this column.
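
In DDL form, what I am picturing is something like this (an untested
sketch based on Josh's schema above):

    CREATE TABLE operations (
        id           serial     not null primary key,
        host_id      int        not null,
        timeoccurred timestamp  not null default now(),
        category     varchar(5) not null,
        constraint operations_unq unique (host_id, timeoccurred, category)
    );

    CREATE TABLE anomalies (
        operation_id int  not null references operations(id) on delete cascade,
        anomaly      text not null,
        primary key (operation_id, anomaly)
    );

Of course, the primary key index on anomalies still contains the full
anomaly text, which is exactly the large index I was hoping to avoid.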

> As a warning, though: you may find that for insert speed the referential
> integrity key is better maintained at the middleware layer. We've had some
> complaints about slow FK enforcement in Postgres.

Thanks, I will keep this in mind. The inserts are usually done in a
batch job, so interactive speed is generally not an issue ... but faster
is always better :)

Also, I am curious: with the tables more normalized, I now need to
perform a join when selecting data. I realize there will be fewer pages
to read from disk (which is good!) when doing this join, but I will be
interested to see how the join affects the interactive performance of
the queries ... something to test :)
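
For example, a typical report would turn into something like this
(a hypothetical query against Josh's schema above):

    SELECT o.host_id, o.timeoccurred, o.category, a.anomaly
      FROM operations o
      JOIN anomalies a ON a.operation_id = o.id
     WHERE o.timeoccurred >= '2003-02-17'
       AND o.timeoccurred <  '2003-02-18';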

Thanks for looking at this, Josh, and providing input. Hopefully by
explaining my figuring and thinking you can see what I am looking at ...
and point out additional flaws in my methods :) Unfortunately, I still
do not think I can remove the anomaly field from the index yet, even by
normalizing the tables as you did.

I think Daniel has the right answer: move the unique-constraint check
out into the stuffer script, or use some method of indexing on a hash,
as I proposed at the beginning of the thread.

I have figured out a way to reduce my md5 index size by half again, and
have developed a method for dealing with collisions too. I am planning
on running some benchmarks against the current method, against the
tables normalized as Josh suggested, and against the md5 hash scheme I
am working on. The benchmarks will be fairly simple: the average time to
insert X valid rows, the average time to insert X duplicate rows, and
the disk space used. If anyone is interested I am willing to post the
results to this list ... and if anyone has other benchmark suggestions
they would like to see, feel free to let me know.
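
For the disk space part, one simple way to compare is the page counts in
the system catalogs (a sketch; the relation names are just the ones used
in the examples above, and relpages is only accurate after a VACUUM or
ANALYZE):

    SELECT relname, relpages, reltuples
      FROM pg_class
     WHERE relname IN ('history', 'operations', 'anomalies', 'operations_unq');
    -- relpages * 8 kB (the default block size) approximates the on-disk size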

Thanks much for all the help and insight!

- Ryan

[1] select count(anomaly) from history;
count
----------
85221799

[2] I grabbed the 36 bytes overhead / tuple from this old FAQ I found
at http://www.guides.sk/postgresql_docs/faq-english.html
I did not look at what the current value is today.

[3] This was a rough calculation of a maximum; I do not believe we are
at this maximum, so the space savings are most likely larger.

--
Ryan Bradetich <rbradetich(at)uswest(dot)net>
