Table 8-4. Character Types
|character varying(n), varchar(n)||variable-length with limit|
|character(n), char(n)||fixed-length, blank padded|
|text||variable unlimited length|
Table 8-4 shows the general-purpose character types available in PostgreSQL.
SQL defines two primary character types: character varying(n) and character(n), where n is a positive integer. Both of these types can store strings up to n characters in length. An attempt to store a longer string into a column of these types will result in an error, unless the excess characters are all spaces, in which case the string will be truncated to the maximum length. (This somewhat bizarre exception is required by the SQL standard.) If the string to be stored is shorter than the declared length, values of type character will be space-padded; values of type character varying will simply store the shorter string.
If one explicitly casts a value to character varying(n) or character(n), then an over-length value will be truncated to n characters without raising an error. (This too is required by the SQL standard.)
The notations varchar(n) and char(n) are aliases for character varying(n) and character(n), respectively. character without length specifier is equivalent to character(1). If character varying is used without length specifier, the type accepts strings of any size. The latter is a PostgreSQL extension.
In addition, PostgreSQL provides the text type, which stores strings of any length. Although the type text is not in the SQL standard, several other SQL database management systems have it as well.
Values of type character are physically padded with spaces to the specified width n, and are stored and displayed that way. However, the padding spaces are treated as semantically insignificant. Trailing spaces are disregarded when comparing two values of type character, and they will be removed when converting a character value to one of the other string types. Note that trailing spaces are semantically significant in character varying and text values.
The storage requirement for data of these types is 4 bytes plus the actual string, and in case of character plus the padding. Long strings are compressed by the system automatically, so the physical requirement on disk may be less. Long values are also stored in background tables so they do not interfere with rapid access to the shorter column values. In any case, the longest possible character string that can be stored is about 1 GB. (The maximum value that will be allowed for n in the data type declaration is less than that. It wouldn't be very useful to change this because with multibyte character encodings the number of characters and bytes can be quite different anyway. If you desire to store long strings with no specific upper limit, use text or character varying without a length specifier, rather than making up an arbitrary length limit.)
Tip: There are no performance differences between these three types, apart from the increased storage size when using the blank-padded type. While character(n) has performance advantages in some other database systems, it has no such advantages in PostgreSQL. In most situations text or character varying should be used instead.
Refer to Section 188.8.131.52 for information about the syntax of string literals, and to Chapter 9 for information about available operators and functions. The database character set determines the character set used to store textual values; for more information on character set support, refer to Section 21.2.
Example 8-1. Using the character types
CREATE TABLE test1 (a character(4)); INSERT INTO test1 VALUES ('ok'); SELECT a, char_length(a) FROM test1; -- (1) a | char_length ------+------------- ok | 2 CREATE TABLE test2 (b varchar(5)); INSERT INTO test2 VALUES ('ok'); INSERT INTO test2 VALUES ('good '); INSERT INTO test2 VALUES ('too long'); ERROR: value too long for type character varying(5) INSERT INTO test2 VALUES ('too long'::varchar(5)); -- explicit truncation SELECT b, char_length(b) FROM test2; b | char_length -------+------------- ok | 2 good | 5 too l | 5
There are two other fixed-length character types in PostgreSQL, shown in Table 8-5. The name type exists only for storage of identifiers in the internal system catalogs and is not intended for use by the general user. Its length is currently defined as 64 bytes (63 usable characters plus terminator) but should be referenced using the constant NAMEDATALEN. The length is set at compile time (and is therefore adjustable for special uses); the default maximum length may change in a future release. The type "char" (note the quotes) is different from char(1) in that it only uses one byte of storage. It is internally used in the system catalogs as a poor-man's enumeration type.