Re: Index bloat problem?

From: David Roussel <pgsql-performance(at)diroussel(dot)xsmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Bill Chandler <billybobc1210(at)yahoo(dot)com>, pgsql-perform <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Index bloat problem?
Date: 2005-04-22 08:44:05
Message-ID: 869771629a1d24a91b8df779f7cddfe9@diroussel.xsmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 22 Apr 2005, at 06:57, Tom Lane wrote:
> Bill Chandler <billybobc1210(at)yahoo(dot)com> writes:
>> Client is reporting that the size of an index is
>> greater than the number of rows in the table (1.9
>> million vs. 1.5 million).
>
> This thread seems to have wandered away without asking the critical
> question "what did you mean by that?"
>
> It's not possible for an index to have more rows than there are in
> the table unless something is seriously broken. And there aren't
> any SQL operations that let you inspect an index directly anyway.
> So: what is the actual observation that led you to the above
> conclusion? Facts, please, not inferences.

I work for the client in question. Glad you picked up on that point. I
covered the detail in my my post "How can an index be larger than a
table" on 21 Apr. 2005. I guess I was too detailed, and too much info
put people off.
http://archives.postgresql.org/pgsql-performance/2005-04/msg00553.php

quoting from there...

|
|SELECT relname, relkind, reltuples, relpages FROM pg_class WHERE
relname LIKE 'dave_data%';
|
|relname relkind reltuples relpages
|======================================= ======= ========= ========
|dave_data_update_events r 1593600.0 40209
|dave_data_update_events_event_id_key i 1912320.0 29271
|dave_data_update_events_event_id_seq S 1.0 1
|dave_data_update_events_lds_idx i 1593600.0 6139
|dave_data_update_events_obj_id_idx i 1593600.0 6139
|iso_pjm_data_update_events_obj_id_idx i 1593600.0 6139
|

Note that there are only 1593600 rows in the table, so why the 1912320
figure?

Of course I checked that the row count was correct...

|
|EXPLAIN ANALYZE
|select count(*) from iso_pjm_data_update_events
|
|QUERY PLAN
|Aggregate (cost=60129.00..60129.00 rows=1 width=0) (actual
time=35933.292..35933.293 rows=1 loops=1)
| -> Seq Scan on iso_pjm_data_update_events (cost=0.00..56145.00
rows=1593600 width=0) (actual time=0.213..27919.497 rows=1593600
loops=1)
|Total runtime: 35933.489 ms
|

and...

|
|select count(*) from iso_pjm_data_update_events
|
|count
|1593600
|

so it's not that there are any undeleted rows lying around

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Richard van den Berg 2005-04-22 09:04:17 Re: When are index scans used over seq scans?
Previous Message Dave Page 2005-04-22 08:08:01 Re: Joel's Performance Issues WAS : Opteron vs Xeon