Skip site navigation (1) Skip section navigation (2)

Re: Index bloat problem?

From: David Roussel <pgsql-performance(at)diroussel(dot)xsmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bill Chandler <billybobc1210(at)yahoo(dot)com>,pgsql-perform <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Index bloat problem?
Date: 2005-04-22 08:44:05
Message-ID: 869771629a1d24a91b8df779f7cddfe9@diroussel.xsmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
On 22 Apr 2005, at 06:57, Tom Lane wrote:
> Bill Chandler <billybobc1210(at)yahoo(dot)com> writes:
>> Client is reporting that the size of an index is
>> greater than the number of rows in the table (1.9
>> million vs. 1.5 million).
>
> This thread seems to have wandered away without asking the critical
> question "what did you mean by that?"
>
> It's not possible for an index to have more rows than there are in
> the table unless something is seriously broken.  And there aren't
> any SQL operations that let you inspect an index directly anyway.
> So: what is the actual observation that led you to the above
> conclusion?  Facts, please, not inferences.

I work for the client in question. Glad you picked up on that point.  I 
covered the detail in my my post "How can an index be larger than a 
table" on 21 Apr. 2005.  I guess I was too detailed, and too much info 
put people off.
http://archives.postgresql.org/pgsql-performance/2005-04/msg00553.php

quoting from there...

|
|SELECT relname, relkind, reltuples, relpages FROM pg_class WHERE
relname LIKE 'dave_data%';
|
|relname                                relkind reltuples relpages
|======================================= ======= ========= ========
|dave_data_update_events                r       1593600.0 40209
|dave_data_update_events_event_id_key   i       1912320.0 29271
|dave_data_update_events_event_id_seq   S       1.0       1
|dave_data_update_events_lds_idx        i       1593600.0 6139
|dave_data_update_events_obj_id_idx     i       1593600.0 6139
|iso_pjm_data_update_events_obj_id_idx  i       1593600.0 6139
|

Note that there are only 1593600 rows in the table, so why the 1912320
figure?

Of course I checked that the row count was correct...

|
|EXPLAIN ANALYZE
|select count(*) from iso_pjm_data_update_events
|
|QUERY PLAN
|Aggregate  (cost=60129.00..60129.00 rows=1 width=0) (actual
time=35933.292..35933.293 rows=1 loops=1)
|  ->  Seq Scan on iso_pjm_data_update_events  (cost=0.00..56145.00
rows=1593600 width=0) (actual time=0.213..27919.497 rows=1593600
loops=1)
|Total runtime: 35933.489 ms
|

and...

|
|select count(*) from iso_pjm_data_update_events
|
|count
|1593600
|

so it's not that there are any undeleted rows lying around


In response to

Responses

pgsql-performance by date

Next:From: Richard van den BergDate: 2005-04-22 09:04:17
Subject: Re: When are index scans used over seq scans?
Previous:From: Dave PageDate: 2005-04-22 08:08:01
Subject: Re: Joel's Performance Issues WAS : Opteron vs Xeon

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group