Re: Netflix Prize data

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Greg Sabino Mullane" <greg(at)turnstep(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Netflix Prize data
Date: 2006-10-04 23:36:09
Message-ID: 87hcyj64d2.fsf@enterprisedb.com
Lists: pgsql-hackers


"Greg Sabino Mullane" <greg(at)turnstep(dot)com> writes:

> CREATE TABLE rating (
> movie SMALLINT NOT NULL,
> person INTEGER NOT NULL,
> rating SMALLINT NOT NULL,
> viewed DATE NOT NULL
> );

You would probably be better off putting the two smallints first, followed by
the integer and the date. Otherwise the integer and the date field will each
have two bytes of alignment padding in front of them, wasting 4 bytes per row.
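
To make the padding arithmetic concrete, here is a rough sketch in Python. It
is not PostgreSQL code; the sizes and alignments in TYPE_INFO are assumptions
(smallint 2 bytes on a 2-byte boundary, integer and date 4 bytes on a 4-byte
boundary), data_size is a made-up helper name, and any rounding of the whole
tuple to a larger boundary is ignored.

```python
# Assumed (size, alignment) in bytes for the column types in the quoted table.
TYPE_INFO = {
    "smallint": (2, 2),
    "integer": (4, 4),
    "date": (4, 4),
}


def data_size(columns):
    """Return (total_bytes, padding_bytes) for columns laid out in the given order."""
    offset = 0
    padding = 0
    for col_type in columns:
        size, align = TYPE_INFO[col_type]
        pad = (-offset) % align  # bytes needed to reach the next aligned offset
        padding += pad
        offset += pad + size
    return offset, padding


# Column order from the quoted CREATE TABLE: movie, person, rating, viewed.
original = ["smallint", "integer", "smallint", "date"]
# Suggested order: both smallints first, then the integer and the date.
reordered = ["smallint", "smallint", "integer", "date"]

print(data_size(original))   # (16, 4)
print(data_size(reordered))  # (12, 0)
```

Under those assumptions the original column order takes 16 bytes of data, 4 of
them padding, and the reordered one takes 12 bytes with no padding.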

If you reorder the fields that way you'll be down to 28 bytes of tuple header
overhead and 12 bytes of data. There's actually another 4 bytes in the form of
the line pointer, so a total of 44 bytes per record. I.e., almost 73% of the
disk I/O you're seeing (32 of those 44 bytes) is actually per-record overhead.
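
The same numbers as a quick back-of-the-envelope check in Python. The three
constants are just the figures quoted above, and overhead_fraction is a made-up
helper name for illustration, nothing from PostgreSQL itself.

```python
# Per-record figures quoted in this message, in bytes.
HEADER_BYTES = 28        # tuple header overhead
DATA_BYTES = 12          # two smallints, one integer, one date, no padding
LINE_POINTER_BYTES = 4   # line pointer in the page


def overhead_fraction(header, data, line_pointer):
    """Return (total_bytes, fraction of the total that is not user data)."""
    total = header + data + line_pointer
    return total, (header + line_pointer) / total


total, fraction = overhead_fraction(HEADER_BYTES, DATA_BYTES, LINE_POINTER_BYTES)
print(total)              # 44
print(f"{fraction:.1%}")  # 72.7%
```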

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
