From: | "D(dot) Dante Lorenso" <dante(at)lorenso(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Indexing a Boolean or Null column? |
Date: | 2004-01-04 01:18:34 |
Message-ID: | 3FF769EA.7090508@lorenso.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
I've been debating with a collegue who argues that indexing a
boolean column is a BAD idea and that is will actually slow
down queries.
My plan is to have a table with many rows sharing 'versions'
(version/archive/history) of data where the most current row
is the one where 'is_active' contains a true value.
If the table begins to look like this:
data_id(pk) | data_lookup_key | data_is_active | ...
------------+-----------------+----------------+--------
1 | banana | false | ...
2 | banana | false | ...
3 | banana | false | ...
4 | banana | false | ...
5 | banana | false | ...
6 | banana | false | ...
7 | banana | false | ...
8 | banana | false | ...
9 | banana | true | ...
10 | apple | true | ...
11 | pear | false | ...
12 | pear | false | ...
13 | pear | false | ...
14 | pear | false | ...
15 | pear | false | ...
...
1000000 | pear | true | ...
Will an index on the 'data_is_active' column be used or work
as I expect? I'm assuming that I may have a million entries
sharing the same 'data_lookup_key' and I'll be using that to
search for the active version of the row.
SELECT *
FROM table
WHERE data_lookup_key = 'pear'
AND data_is_active IS TRUE;
Does it make sense to have an index on data_is_active?
Now, I've read that in some databases the index on a column that
has relatively even distribution of values over a small set of values
will not be efficient.
I bet this is in a FAQ somewhere. Can you point me in the right
direction?
Dante
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2004-01-04 04:26:52 | Re: Indexing a Boolean or Null column? |
Previous Message | Chris Trawick | 2004-01-03 06:07:14 | Re: "fun with multipart primary keys" hobby kit |