Call closed: Almost happy ending (from "Data files became huge with no apparent reason" thread)

From: Dario Fumagalli <dfumagalli(at)tin(dot)it>
To: pgsql-general(at)postgresql(dot)org
Subject: Call closed: Almost happy ending (from "Data files became huge with no apparent reason" thread)
Date: 2002-09-03 14:08:18
Message-ID: 3D74C252.2080609@tin.it
Lists: pgsql-general

Martijn van Oosterhout wrote:

> On Tue, Sep 03, 2002 at 12:39:01PM +0200, Dario Fumagalli wrote:
>
>>Step by step procedure:
>>1) I stopped the postmaster and made sure no process was still on (ps aux)
>>2) As postgres user I started postgres -O -P -D path etc.
>>3) A prompt appeared. Here I typed reindex proj_store force (I'm trying
>>to remember it since I'm not in the company the server is in).
>>4) Messages appeared stating indexes were being rebuilt (no errors shown).
>>5) I hit CTRL-D. Postgres exited.
>>6) I restarted the postmaster.
>>
>
> Should work but seems excessive.
>
>
>> pg_operator_oprname_l_r_k_index | 1092613 | 0 | 0 | i | 7
>> pg_proc | 1255 | 17231 | 17243 | r | 28
>> pg_proc_oid_index | 17166 | 0 | 0 | i | 5
>> pg_proc_proname_narg_type_index | 17169 | 0 | 0 | i | 20
>>
>
> Your list seems truncated?
>

True (I think due to some clipboard blues with telnet). Here is the
remaining part:

pg_relcheck | 1216 | 17246 | 17258 | r | 0
pg_relcheck_rcrelid_index | 1092625 | 0 | 0 | i | 1
pg_rewrite | 17058 | 17261 | 17273 | r | 1
pg_rewrite_oid_index | 1092626 | 0 | 0 | i | 2
pg_rewrite_rulename_index | 1092627 | 0 | 0 | i | 2
pg_rules | 17309 | 0 | 0 | v | 10
pg_shadow | 1260 | 0 | 0 | r | 1
pg_statistic | 16600 | 17276 | 17288 | r | 4
pg_statistic_relid_att_index | 1092628 | 0 | 0 | i | 4
pg_tables | 17335 | 0 | 0 | v | 10
pg_toast_1215 | 17201 | 0 | 0 | t | 0
pg_toast_1215_idx | 17213 | 0 | 0 | i | 1
pg_toast_1216 | 17246 | 0 | 0 | t | 0
pg_toast_1216_idx | 17258 | 0 | 0 | i | 1
pg_toast_1255 | 17231 | 0 | 0 | t | 0
pg_toast_1255_idx | 17243 | 0 | 0 | i | 1
pg_toast_16600 | 17276 | 0 | 0 | t | 1
pg_toast_16600_idx | 17288 | 0 | 0 | i | 2
pg_toast_17058 | 17261 | 0 | 0 | t | 0
pg_toast_17058_idx | 17273 | 0 | 0 | i | 1
pg_toast_17086 | 17216 | 0 | 0 | t | 0
pg_toast_17086_idx | 17228 | 0 | 0 | i | 1
pg_toast_258417 | 258525 | 0 | 0 | t | 0
pg_toast_258417_idx | 258537 | 0 | 0 | i | 1
pg_trigger | 1219 | 0 | 0 | r | 1
pg_trigger_tgconstrname_index | 1092606 | 0 | 0 | i | 2
pg_trigger_tgconstrrelid_index | 1092607 | 0 | 0 | i | 2
pg_trigger_tgrelid_index | 1092608 | 0 | 0 | i | 2
pg_type | 1247 | 0 | 0 | r | 3
pg_type_oid_index | 17193 | 0 | 0 | i | 2
pg_type_typname_index | 17196 | 0 | 0 | i | 2
pg_user | 17291 | 0 | 0 | v | 10
pg_variable | 1264 | 0 | 0 | s | 0
pg_views | 17322 | 0 | 0 | v | 10
pg_xactlock | 0 | 0 | 0 | s | 0
products | 258417 | 258525 | 258537 | r | 66
products_pkey | 258540 | 0 | 0 | i | 1653
products_seq | 258398 | 0 | 0 | S | 10
video_connectors | 258277 | 0 | 0 | r | 0
video_connectors_pkey | 258290 | 0 | 0 | i | 1
video_connectors_seq | 258258 | 0 | 0 | S | 10
video_resolutions | 258204 | 0 | 0 | r | 1
video_resolutions_pkey | 258220 | 0 | 0 | i | 2
video_resolutions_seq | 258185 | 0 | 0 | S | 10
video_standards | 258242 | 0 | 0 | r | 0
video_standards_pkey | 258255 | 0 | 0 | i | 1
video_standards_seq | 258223 | 0 | 0 | S | 10

>
>>As you may see, *_pkey primary keys are BIG. They are the only ones I
>>didn't dare to drop and re-create.
>>
>>Ex.
>>
>> products_pkey | 258540 | 0 | 0 | i | 1653
>>
>>where the entire table takes 66 pages, or, worse
>>
>> det_prod_dep_consumpt_pkey | 258764 | 0 | 0 | i | 3286
>>
>>where the entire table takes 21 pages.
>>
>
> What happens if you say "REINDEX INDEX products_pkey" or "REINDEX INDEX
> det_prod_dep_consumpt_pkey". Do those numbers change?
>

Yes, it did it!

proj_store=# REINDEX INDEX products_pkey;
REINDEX
proj_store=# REINDEX INDEX det_prod_dep_consumpt_pkey;
REINDEX
proj_store=# select relname, relfilenode as node, reltoastrelid as relid,
proj_store-# reltoastidxid as idxid, relkind as k, relpages as pag from
proj_store-# pg_class order by relname;

[Snip]
relname | node | relid | idxid | k | pag
--------------------------------+---------+--------+--------+---+------
det_prod_dep_consumpt_pkey | 1224634 | 0 | 0 | i | 6
[Snip]
products_pkey | 1224633 | 0 | 0 | i | 5

BUT... there is a but...

The du -h command says that, despite the reported index reduction, the
overall database size has increased to 105 MB (!).
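For reference, the relpages figures in these listings are counts of 8 kB blocks (the default BLCKSZ), so a relation's rough on-disk size can be estimated directly. A quick sketch, using the 1653-page products_pkey figure from the listing above:

```shell
# relpages counts 8 kB blocks, so products_pkey at 1653 pages
# works out to roughly 1653 * 8 kB, truncated to whole megabytes:
pages=1653
echo "$(( pages * 8 / 1024 )) MB"   # prints: 12 MB
```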

And this raises a question: why did a REINDEX proj_store FORCE in
single-user mode not pack the indexes, while from psql under the
multiuser postmaster it did? Bah!

> What is the output of "VACUUM VERBOSE ANALYSE products".
>

proj_store=# VACUUM VERBOSE ANALYSE products;
NOTICE: --Relation products--
NOTICE: Pages 6142: Changed 66, reaped 6076, Empty 0, New 0; Tup 976: Vac 90768, Keep/VTL 0/0, Crash 0, UnUsed 3, MinLen 510, MaxLen 622; Re-using: Free/Avail. Space 49289988/49289988; EndEmpty/Avail. Pages 0/6076. CPU 0.25s/0.05u sec.
NOTICE: Index products_pkey: Pages 5; Tuples 976: Deleted 0. CPU 0.01s/0.00u sec.
NOTICE: Index idx_products_csc: Pages 832; Tuples 976: Deleted 90768. CPU 0.59s/1.12u sec.
NOTICE: Index idx_products_b: Pages 729; Tuples 976: Deleted 90768. CPU 0.38s/0.93u sec.
NOTICE: Index idx_products_b2: Pages 275; Tuples 976: Deleted 90768. CPU 0.29s/0.94u sec.
NOTICE: Index idx_products_v: Pages 265; Tuples 976: Deleted 90768. CPU 0.20s/0.95u sec.
NOTICE: Rel products: Pages: 6142 --> 66; Tuple(s) moved: 976. CPU 5.05s/1.67u sec.
NOTICE: Index products_pkey: Pages 7; Tuples 976: Deleted 976. CPU 0.00s/0.02u sec.
NOTICE: Index idx_products_csc: Pages 832; Tuples 976: Deleted 976. CPU 0.55s/0.23u sec.
NOTICE: Index idx_products_b: Pages 730; Tuples 976: Deleted 976. CPU 0.16s/0.02u sec.
NOTICE: Index idx_products_b2: Pages 278; Tuples 976: Deleted 976. CPU 0.06s/0.01u sec.
NOTICE: Index idx_products_v: Pages 267; Tuples 976: Deleted 976. CPU 0.05s/0.00u sec.
NOTICE: --Relation pg_toast_258417--
NOTICE: Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE: Index pg_toast_258417_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
NOTICE: Analyzing...
VACUUM
proj_store=#

>
>>And "not easily"? I may make backups, go standalone, and do whatever
>>evil you may think ;)
>>BTW I have the scripts to re-create indexes or constraints and know how
>>to hack them, since I'm the programmer in charge for all (sql, programs,
>>db administration) for that company.
>>
>
> REINDEX should do it. Please provide the output of the vacuum command.

I finally made a VACUUM VERBOSE ANALYZE.

It did things (lots of deletions and some moves) on the affected tables.
But now, if I reissue the pg_class query, I get figures like the following:
relname | node | relid | idxid | k | pag
---------------------------------+---------+--------+--------+---+------
idx_det_prod_comp_vid_connsp | 1094065 | 0 | 0 | i | 1
idx_det_prod_dep_consumptdp | 1094068 | 0 | 0 | i | 479
idx_det_prod_vid_connsp | 1094059 | 0 | 0 | i | 1
idx_det_prod_vid_resp | 1094053 | 0 | 0 | i | 1
idx_det_prod_vid_stdsp | 1094056 | 0 | 0 | i | 1
idx_det_turnover_c | 1094095 | 0 | 0 | i | 4
idx_det_turnover_t | 1094098 | 0 | 0 | i | 4
idx_products_b | 1094044 | 0 | 0 | i | 733
idx_products_b2 | 1094047 | 0 | 0 | i | 281
idx_products_csc | 1094041 | 0 | 0 | i | 832
idx_products_v | 1094050 | 0 | 0 | i | 270

So, new big index files have been born!
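To spot such oversized indexes without eyeballing the whole pg_class listing, a query like the following could be piped into psql. This is only a sketch; the 100-page cutoff is an arbitrary assumption, not a recommended value:

```shell
# Print a query listing indexes above a page threshold; intended usage:
#   sh this_script.sh | psql proj_store
# The 100-page threshold is an assumption picked for illustration.
threshold=100
cat <<EOF
SELECT relname, relpages
FROM pg_class
WHERE relkind = 'i' AND relpages > ${threshold}
ORDER BY relpages DESC;
EOF
```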
After this vacuum, du -h reports:

[postgres(at)web base]$ du -h
1.6M ./1
1.5M ./18719
24M ./242014
11M ./46821
1.7M ./197097
2.3M ./279236
43M

A lower disk usage than after the reindex above (still more than the
expected 26 MB). For comparison, the development machine (whose data is
unfortunately three weeks out of date, but whose record counts are still
within about ±5% of production):

[postgres(at)web base]$ du -h
1.6M ./1
1.5M ./18719
6.3M ./250600
11M ./46821
1.7M ./197097
2.2M ./259865
25M

>
>>It would be acceptable, instead, to have a database that is never
>>optimized at 100% (i.e. it has "gaps" because of non full vacuums that
>>add for about another almost FIXED 30%), but that NEVER, NEVER grows out
>>of control. We may prepare a db maintenance plan that includes a full
>>vacuum every 6 months.
>>
>
> Firstly, for 7.2 vacuum doesn't lock any tables, so you can run it as often
> as you like. And why only every six months? Just do it daily. If your tables
> are so small, it should take seconds to vacuum to whole database. The

This question is about a different database server for another (very
big) company. There the records number not in the hundreds but in the
tens of millions, and I fear a vacuum will last more than a few seconds.
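Even so, for that bigger installation the daily-vacuum advice could still be followed by scheduling it off-hours via cron, using the vacuumdb wrapper that ships with PostgreSQL. A hedged sketch; the 03:30 schedule and the install path are assumptions:

```shell
# Append a nightly vacuum job to a crontab fragment for the postgres
# user, then install it with: crontab mycron
# The 03:30 schedule and the /usr/local/pgsql path are assumptions.
echo '30 3 * * * /usr/local/pgsql/bin/vacuumdb --all --analyze' >> mycron
cat mycron
```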

> default debian setup runs vacuum daily, as is recommended in the docs.

I hoped to be able to use Debian. I had to mediate between Open Source
(Debian, preferred by us developers) and a closed, commercial solution
(loved by management... sigh). The compromise was Caldera... a Linux
solution, but with some commercial bells and whistles on it to make
everyone happy. But I don't like it as much as Debian for programming
purposes. It is too Red-Hattish :) and comes with ancient tools and
libraries.

>
> Hope this helps,
>

IT HELPED A LOT!!!
In fact using your suggestion and some bricolage :) I solved all the
problems!

Steps to reproduce it:

- Given that REINDEX [database name] FORCE in single-user mode did
rebuild the indexes but did not optimize them;
- Given that a REINDEX INDEX [index name] did optimize and shrink a
single index but increased the overall data size;
- Given that a VACUUM VERBOSE ANALYZE somehow alters the overall
database file size (it seems to re-distribute the free space across
indexes);
- Given that I was at a loss and ready to do anything with a logical
sense (or not);

1) I reindexed all excessively big indexes one by one in psql:

REINDEX INDEX dep_names_pkey;
REINDEX INDEX det_prod_dep_consumpt_pkey;
REINDEX INDEX idx_det_prod_dep_consumptdp;
REINDEX INDEX idx_products_b;
REINDEX INDEX idx_products_b2;
REINDEX INDEX idx_products_csc;
REINDEX INDEX idx_products_v;

So there were no surplus pages anywhere.

2) I performed a VACUUM VERBOSE ANALYZE which, having nothing left to
re-distribute, simply "ate" all the wasted space.
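The two steps above could be bundled into one maintenance script (a sketch; the index names are the ones from step 1, and the generated file is meant to be replayed with something like psql proj_store -f maintenance.sql):

```shell
# Emit the step-1 REINDEX statements followed by the step-2 vacuum
# into a file that psql can replay.
{
  for idx in dep_names_pkey det_prod_dep_consumpt_pkey \
             idx_det_prod_dep_consumptdp idx_products_b idx_products_b2 \
             idx_products_csc idx_products_v
  do
    echo "REINDEX INDEX ${idx};"
  done
  echo "VACUUM VERBOSE ANALYZE;"
} > maintenance.sql
cat maintenance.sql
```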

I know this is probably the weirdest, most uninformed / untechnical
"reasoning" you have seen on this list in years, but it has one small,
almost unnoticeable advantage: IT WORKED, AND NOW THE DATABASE IS ITS
ORIGINAL SIZE AGAIN (even less).

Output of the du -h command of the production machine:

[postgres(at)web base]$ du -h
1.6M ./1
1.5M ./18719
4.0M ./242014
11M ./46821
1.7M ./197097
2.3M ./279236
22M

The machine is open to users again and is as fast as it was, once more
making this LAPP (Linux + Apache + PostgreSQL + PHP 4) intranet /
extranet server the envy of the other, bigger but bloated JSP / ASP /
commercial solutions that sit near it.

Thanks again all for the excellent responses. They really helped a lot.

My best regards,
Dario Fumagalli
