Re: Datatype sizes; a space and speed issue?

From: Joel Matthew <rees(at)ddcom(dot)co(dot)jp>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Datatype sizes; a space and speed issue?
Date: 2004-06-23 04:15:23
Message-ID: 20040623125823.EA33.REES@ddcom.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

> What effect is there if I specify "TEXT" instead of
> say "VARCHAR(255)"?

Well, theoretically, in the latter case, the database will allocate 256
(257? 259?) bytes for that field in the record itself. That is, that
field will consume 256 bytes for each record stored.

In the former case, the database will (again, theoretically) allocate a
string pool (think of a separate file, as one way to do it) for the
field, and will allocate a pointer (4 or 8 bytes) into the string pool
for the field. This means that the field will consume only four or eight
bytes for each record stored, plus (don't forget this, it's important)
however many bytes are actually needed in the string pool for the text
value. If you have, for instance, a fifteen character name stored in
Unicode UTF32, there will be ninety, erm, octets (think bytes for most
CPUs) in the pool for that field of that record, plus a few pointers and
such so the system can tell where to collect garbage.

> How much benefit is there do being more aggresive
> and say cutting it in half again by using "VARCHAR(128)"?

Well, that would reduce the storage requirements for that field by half.
It might also prevent you from storing necessary information. That's
easily 128 characters if you're only using US-ASCII in UTF-8, but it's
only 32 characters of Unicode in UTF-32, and it could be anywhere
between 128 and 32 in Unicode UTF-8. Making a good prediction about the
tradeoff is one of the things a database engineer is paid for.

> I hope this isn't too basic a question!

Since you ask, yeah, it is, but unless things have changed here recently,
the people here aren't going to raise the oven temperature too high when
they roast me for answering it.

--
Joel Matthew <rees(at)ddcom(dot)co(dot)jp>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message postgresql 2004-06-23 04:19:10 Point in time recovery
Previous Message Vams 2004-06-23 04:00:58 Re: Datatype sizes; a space and speed issue?