From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Reduce maximum error in tuples estimation after vacuum.
PostgreSQL estimates the number of live tuples after a vacuum that
has left some buffers unscanned. This estimation works well in most
cases, but produces a completely wrong result when the tuple density
is strongly imbalanced across the table.
create table t (a int, b int);
insert into t
  (select a, (random() * 100000)::int
     from generate_series((select count(*) from t) + 1, 1000000) a);
update t set b = b + 1 where a < (select count(*) from t) * 0.7;
delete from t where a < (select count(*) from t) * 0.99;
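For reference, the row counts this script leaves behind can be checked
with a quick sanity calculation (assuming the table starts empty, so
generate_series runs from 1 to 1000000, and noting that the SQL uses
exact numeric arithmetic for the cutoff):

```python
# "delete from t where a < 1000000 * frac" removes a = 1 .. cutoff-1,
# so rows with a >= cutoff survive.
total = 1_000_000
for frac in (0.9, 0.95, 0.99):
    cutoff = round(total * frac)        # round() to mimic SQL's exact numeric
    remaining = total - cutoff + 1      # surviving rows: a in cutoff .. total
    print(frac, remaining)
```

For a deletion ratio of 0.99 this leaves 10001 rows, matching the
"tups real" column in the tables below.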
After this, pg_stat_user_tables.n_live_tup shows 417670, which is
about 41 times the real number of rows, 10001. And what makes it
worse, with the default settings neither autovacuum nor autoanalyze
will run until n_dead_tup grows to roughly eight times the real
number of tuples in the table.
| postgres=# select n_live_tup, n_dead_tup
|             from pg_stat_user_tables where relname='t';
|  n_live_tup | n_dead_tup
| ------------+------------
|      417670 |          0
| postgres=# select reltuples from pg_class where relname='t';
| postgres=# select count(*) from t;
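To make the "eight times" figure concrete, here is the trigger
condition worked out, assuming the default autovacuum_vacuum_threshold
(50) and autovacuum_vacuum_scale_factor (0.2):

```python
# Autovacuum fires once n_dead_tup exceeds
#   autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples
# With the overestimated reltuples above, the bar sits far higher than
# the real row count.
autovacuum_vacuum_threshold = 50       # default
autovacuum_vacuum_scale_factor = 0.2   # default

reltuples = 417670   # the overestimate reported above
real_rows = 10001    # actual count(*)

trigger = autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples
print(trigger)               # 83584.0 dead tuples needed
print(trigger / real_rows)   # ~8.4x the real row count
```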
Using n_dead_tup from before the vacuum seems to make this better,
but I heard that that plan was abandoned for some reason I don't
know. So I've come up with another plan: using the FSM to estimate
the tuple density in unscanned pages. The point is to make the
estimation rely on the uniformity of tuple length instead of tuple
density. This change seems to keep the error within a few times the
real number of tuples. The additional page reads for the FSM are
about 1/4000th (SlotsPerFSMPage) of the skipped pages, which I
suppose is tolerable during vacuum.
The overall algorithm can be illustrated as below:
- Sum up, over all scanned pages, the used bytes, the max offnum
  (PageGetMaxOffsetNumber), the maximum free bytes for tuple data,
  and the free bytes after page vacuum.
- Sum up the free bytes reported by the FSM over all skipped pages.
- Calculate the mean tuple length from the overall used bytes, the
  sum of max offnums, and the number of scanned pages.
- Guess the tuple density in the skipped pages using the overall
  free bytes from the FSM and the mean tuple length calculated above.
- Finally, feed the estimated number of live tuples BEFORE the
  vacuum into vac_estimate_reltuples.
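The steps above can be sketched as follows. This is a minimal
illustration, not the patch code: the function name is mine, and it
ignores page header and line-pointer overhead, treating the whole
8192-byte page (BLCKSZ) as tuple space.

```python
# Hedged sketch of the proposed FSM-based estimation (not the actual patch).
BLCKSZ = 8192  # PostgreSQL's default page size

def estimate_live_tuples(scanned_used_bytes, scanned_offnum_sum,
                         skipped_pages, fsm_free_bytes):
    """Estimate live tuples when some pages were skipped, using the
    FSM-reported free space of the skipped pages."""
    if scanned_offnum_sum == 0:
        return 0
    # Mean tuple length observed on the scanned pages.
    mean_tuple_len = scanned_used_bytes / scanned_offnum_sum
    # Bytes presumably occupied on the skipped pages: total page space
    # minus the free space the FSM reports for them.
    skipped_used = max(skipped_pages * BLCKSZ - fsm_free_bytes, 0)
    # Assume the skipped pages hold tuples of the mean length.
    skipped_tuples = skipped_used / mean_tuple_len
    return scanned_offnum_sum + skipped_tuples
```

For example, 100 scanned pages holding 10000 tuples in 400000 bytes
give a mean length of 40 bytes; if 100 skipped pages report 419200
free bytes via the FSM, the skipped pages are estimated to hold
another 10000 tuples.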
Of course this method is affected by an imbalance of tuple LENGTH
instead, but the error also seems to stay within a few times the real
number of rows. For rows of invariable length, the test on head shows
the following, where "tups est" is pg_class.reltuples and "tups real"
is count(*).
 del% | pages | n_live_tup | tups est | tups real | est/real | bufs
------+-------+------------+----------+-----------+----------+------
 0.9  |  4425 |     100001 |   470626 |    100001 |    4.706 | 3985
 0.95 |  4425 |      50001 |   441196 |     50001 |    8.824 | 4206
 0.99 |  4425 |     417670 |   417670 |     10001 |   41.763 | 4383
and with the patch:
 del% | pages | n_live_tup | tups est | tups real | est/real | bufs
------+-------+------------+----------+-----------+----------+------
 0.9  |  4425 |     106169 |   106169 |    100001 |    1.062 | 3985
 0.95 |  4425 |      56373 |    56373 |     50001 |    1.127 | 4206
 0.99 |  4425 |      10001 |    16535 |     10001 |    1.653 | 4383
What do you think about this?
The attached files are:
- vacuum_est_improve_20130614.patch: the patch for this proposal
- vactest.sql: sql script to cause the situation
- vactest.sh: test script to find the errors related to this patch
- test_result.txt: all of the test results for the various deletion
  ratios which the test script above yields
NTT Open Source Software Center