Estimating hot data size

From: Chris Hoover <revoohc(at)gmail(dot)com>
To: PGSQL Performance <pgsql-performance(at)postgresql(dot)org>
Subject: Estimating hot data size
Date: 2011-02-16 20:51:36
Message-ID: AANLkTikxfB099wX3TtnXvKPoFLCcXtMCYpNvCSygpNPb@mail.gmail.com
Lists: pgsql-performance

All,

I'm trying to estimate the size of my hot data set, and wanted to get some
validation that I'm doing this correctly.

Basically, I'm taking sum(heap_blks_read + idx_blks_read) from
pg_statio_all_tables and diffing the numbers over a period of time (at least
1 hour). Is this a fair estimate? The reason for doing this is that we are
looking at new server hardware, and I want to get enough RAM on the
machine to keep the hot data in memory plus provide room for growth.

Thanks,

Chris

Example:

Time                            Total Blocks
2011-02-16 11:25:34.621874-05   123,260,464,427.00
2011-02-16 12:25:46.486719-05   123,325,880,943.00

To get the hot data for this hour (in KB), I'm taking:

(123,325,880,943.00 - 123,260,464,427.00) * 8 = 523,332,128 KB

Correct?
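
The arithmetic above can be reproduced with a short Python sketch. It assumes the default 8 KB PostgreSQL block size (which is what the "* 8" implies) and uses a hypothetical helper name, blocks_read_kb, that is not part of any library. It only converts the counter difference to a size. Because the counters are cumulative read counts, a block read more than once in the interval is counted each time, so whether the result represents distinct hot data is a separate question.

```python
BLOCK_SIZE_KB = 8  # assumption: default PostgreSQL block size of 8 KB


def blocks_read_kb(start_blocks: int, end_blocks: int,
                   block_size_kb: int = BLOCK_SIZE_KB) -> int:
    """Convert the difference between two cumulative block counters to KB."""
    if end_blocks < start_blocks:
        # Cumulative counters only decrease if the statistics were reset.
        raise ValueError("counter went backwards; statistics may have been reset")
    return (end_blocks - start_blocks) * block_size_kb


# The two snapshots of sum(heap_blks_read + idx_blks_read) from the example.
start_blocks = 123_260_464_427  # 2011-02-16 11:25:34
end_blocks = 123_325_880_943    # 2011-02-16 12:25:46

kb = blocks_read_kb(start_blocks, end_blocks)
print(f"{kb:,} KB")                    # 523,332,128 KB
print(f"{kb / 1024 / 1024:.1f} GiB")   # 499.1 GiB
```

The snapshots are passed as plain integers so the same function works with values copied from psql or collected by a monitoring script.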
