Re: Estimating space required for indexes

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Estimating space required for indexes
Date: 2003-04-28 14:10:21
Message-ID: 18084.1051539021@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Shridhar Daithankar <shridhar_daithankar(at)persistent(dot)co(dot)in> writes:
> Yesterday I was trying to upload a medium size terrain data dump to a
> postgresql database. (350 files, of sizes between 8-13MB each.. 3.3GB all in
> all). The load was done in 45 minutes.(7.3.2 is real fast for data loading is
> my feeling, anyway..). It took 4GB of disk space after upload.

> A tuple consists of 3 float values, x,y,z. I had to create a compound index on
> x and y. I started indexing it and killed it 1.5 hours later as it filled
> rest of the 5GB free disk upto point of choking.

AFAIK, a CREATE INDEX should require no more than twice the finished
index's size on disk. I'm surprised that you were able to build the
index one way and not the other.

> How can I predict reasonably how much disk space I am going to need for such
> kind of indexing operation?

Assuming your "float"s were float4s, the heap tuple size is

28 bytes overhead + 3 * 4 bytes data = 40 bytes/row

(assuming WITHOUT OIDS, no nulls, Intel-like alignment rules) while the
index tuple size is

12 bytes overhead + 2 * 4 bytes data = 20 bytes/row

But this is not the whole story because heap pages are normally crammed
full while btree index pages are normally only filled 2/3rds full during
initial creation. (Plus you have to allow for upper b-tree levels, but
with such small index entries that won't be much.) So I'd have expected
the finished index to be about 3/4ths the size of the table proper.
I'm surprised you could fit it at all.

> This data is just a small sample of things and
> more data is coming.

Better buy more disk ...

regards, tom lane

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Jeremiah Jahn 2003-04-28 14:13:43 Re: > 16TB worth of data question
Previous Message Andrew Sullivan 2003-04-28 11:25:40 Re: Solaris