Estimating hot data size

From:	Chris Hoover <revoohc(at)gmail(dot)com>
To:	PGSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject:	Estimating hot data size
Date:	2011-02-16 20:51:36
Message-ID:	AANLkTikxfB099wX3TtnXvKPoFLCcXtMCYpNvCSygpNPb@mail.gmail.com
Lists:	pgsql-performance

All,

I'm trying to estimate the size of my hot data set, and wanted to get some
validation that I'm doing this correctly.

Basically, I'm taking sum(heap_blks_read + idx_blks_read) from
pg_statio_all_tables and diffing the numbers over a period of time (at least
1 hour). Is this a fair estimate? We are looking at new server hardware, and
I want enough RAM on the machine to keep the hot data in memory, plus room
for growth.

Thanks,

Chris

Example:

Time                             Total Blocks
2011-02-16 11:25:34.621874-05    123,260,464,427.00
2011-02-16 12:25:46.486719-05    123,325,880,943.00

To get the hot data for this hour (in KB), I'm taking:

(123,325,880,943.00 - 123,260,464,427.00) * 8 = 523,332,128 KB

Correct?
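
The arithmetic above can be reproduced with a short Python sketch. The function name blocks_read_kb and the constant BLOCK_SIZE_KB are hypothetical, introduced only for illustration, and the factor of 8 assumes PostgreSQL's default 8 kB block size. Note that heap_blks_read and idx_blks_read are cumulative counters of blocks read, so the same block can be counted more than once; the result is the volume read in the interval, which may or may not match the size of the hot set.

```python
# Assumption: the server uses PostgreSQL's default block size of 8 kB.
BLOCK_SIZE_KB = 8


def blocks_read_kb(start_blocks: int, end_blocks: int,
                   block_size_kb: int = BLOCK_SIZE_KB) -> int:
    """Return the KB read between two snapshots of the cumulative counter.

    Both arguments are sum(heap_blks_read + idx_blks_read) sampled from
    pg_statio_all_tables at the start and end of the interval.
    """
    if end_blocks < start_blocks:
        # The counters only grow unless statistics were reset in between.
        raise ValueError("end snapshot is smaller than start snapshot")
    return (end_blocks - start_blocks) * block_size_kb


# The two snapshots from the example, taken roughly one hour apart.
kb = blocks_read_kb(123_260_464_427, 123_325_880_943)
print(f"{kb:,} KB read in the interval")
```

With the two snapshots from the example this gives the same 523,332,128 KB as the manual calculation.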

