Re: Updates are slow..

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Tom Burke" <lists(at)spamex(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Updates are slow..
Date: 2002-06-12 15:00:47
Message-ID: 9762.1023894047@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

"Tom Burke" <lists(at)spamex(dot)com> writes:
> After creating this index, the update time jumped up again to 21 mins!
> real 21m3.536s
> user 0m0.010s
> sys 0m0.000s

> This makes no sense to me because the new index is not related to
> the column being updated at all.

Doesn't matter: we're entering a new tuple (new version of the row) so
new entries must be made in all indexes for the table.

> Actually, it is not unique.
> This is not really an emp table, and in fact ~600K of the email
> addresses are duplicates.

Ah so. The btree code doesn't really like huge numbers of duplicate
keys --- it still works, but the algorithms degenerate to O(N) instead
of O(log N). I imagine the previous two-column incarnation with
(email, dept_id) was also quite non-unique?

So the bottom line is not to bother with making indexes on highly
redundant columns; they aren't fast and they don't do you any good
anyway.

If the situation is that you've got, say, 600K empty email addresses and
another 600K that are actually useful, you might consider making a
partial index that excludes the empty addresses; then that could be used
to look up real addresses without paying the overhead to index the empty
ones. See
http://www.postgresql.org/users-lounge/docs/7.2/postgres/indexes-partial.html

regards, tom lane

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Manfred Koizar 2002-06-12 15:08:40 Re: Updates are slow..
Previous Message Tom Lane 2002-06-12 14:41:13 Re: dynamic querys