Skip site navigation (1) Skip section navigation (2)

Re: default index created for primary key

From: "Frank D(dot) Engel, Jr(dot)" <fde101(at)fjrhome(dot)net>
To: General PostgreSQL list <pgsql-general(at)postgresql(dot)org>
Subject: Re: default index created for primary key
Date: 2004-12-22 19:00:12
Message-ID: B510C398-544B-11D9-AF71-0050E410655F@fjrhome.net (view raw or flat)
Thread:
Lists: pgsql-general
On Dec 22, 2004, at 12:09 PM, vinita bansal wrote:
>
> I am still not clear on why postgres has this restriction?
> By uniqueness, you mean to say that if later anyone wants to add a 
> primary key constraint on a table which already has a primary key 
> defined, postgres will use this index to determine that there is 
> already a primary key defined and would not allow to add this 
> constraint since a table cannot have two primary keys??

No, an index is required for efficiency.

Consider a table with a column which must be unique.  Assume there are 
350,000 rows in the table.  Now *every time* you insert a new row or 
perform an update which changes that unique column, assuming no index 
on the column, the database would need to check all 350,000 rows 
individually to determine that the value is in fact unique.

With an index on the column, it is relatively quick for the database to 
determine that the value is unique, as it does not need to check nearly 
as many values..

To see this (rough example), start with an empty table with a single 
column, which is a unique integer column.  Now add values and watch 
what happens to an index (use a fixed-width font):

4 53 72 15 23 19 3 12 8

Index:

4

4
  \
   53

   53
  /  \
4    72

   53
  /  \
4    72
  \
   15

      53
     /  \
   15    72
  /  \
4    23

      19
     /  \
   15    53
  /  \     \
4    23    72

       15
      /  \
     /    \
    /      \
   4        53
  / \      /  \
3   23  19    72

       15
      /  \
     /    \
    /      \
   4        53
  / \      /  \
3   23  19    72
  \
   12

         15
        /  \
       /    \
      /      \
     4        53
    / \      /  \
   8   23  19    72
  / \
3   12


Now the user tries to insert 12, which is already in the table.  In 
order to determine that 12 is in the table, the database could scan 
every value in the table until it finds it.  In this case, it would 
need to check 8 rows.  Using the index, it would only need to check 4 
values, cutting the time in half.  In a few cases, it may take 
marginally longer (2 as opposed to 1 for the value 4), but on average, 
5 rows for unindexed vs. 2.8 rows for indexed, shows that the index has 
a definite advantage.

Now extend this to 350,000 rows.  Without an index, you'd need to check 
an average of about 175,000 rows just to determine that a value was 
there.  And if the value is not there, as will more commonly be the 
case, you'd need to check them all.  With an index like the ones I used 
above, you would need to check *at most* 19 values.  You begin to see 
why PostgreSQL requires an index here.

-----------------------------------------------------------
Frank D. Engel, Jr.  <fde101(at)fjrhome(dot)net>

$ ln -s /usr/share/kjvbible /usr/manual
$ true | cat /usr/manual | grep "John 3:16"
John 3:16 For God so loved the world, that he gave his only begotten 
Son, that whosoever believeth in him should not perish, but have 
everlasting life.
$

In response to

pgsql-general by date

Next:From: Joshua D. DrakeDate: 2004-12-22 19:02:30
Subject: Re: nice work on the new site
Previous:From: Vivek KheraDate: 2004-12-22 18:46:21
Subject: Re: What HW / OS is recommeded

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group