byte-size of column values

From: Dominique Devienne <ddevienne(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: byte-size of column values
Date: 2022-10-18 15:51:26
Message-ID: CAFCRh-8gUgsZt8Yg7c_k2Es7wPPQwhqghGJO3V7azcSFfj+BRg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi. I'm surprised by the result for bit(3) and char, when calling
pg_column_size().

Why 6, instead of 1? The doc does mention 5-8 bytes overhead, but I
expected those for varying bit, not fixed-sized bit typed values. How
come?

Similarly, why 2 for char? Is it linked to Unicode?
1 byte for the varying UTF-8 encoded length, then the (potentially)
multi-byte encoding?
(the very bit pattern of UTF-8 allows to infer the encoded length, so
storing the length explicitly is theoretically not even necessary)

Similarly, enums are always 4 bytes I read, despite rarely exceeding
cardinalities beyond a single byte can store.

How does one store as compactedly as possible several small enums, on
millions of rows?

And is the minimum column size always 2?

I'm sure many we call out "premature optimization", but isn't using 32
bits instead of 2, 3 (or 8, to round to a byte) wasteful, in disk
space, thus then ultimately energy? (OK, that last one is pushing it
:) ).

I'm sure there are reasons for the above. And I guess I'm curious
about them. Thanks, --DD

ddevienne=> create table foo (b3 bit(3), i2 int2, i4 int4, i8 int8, c char);
CREATE TABLE
ddevienne=> insert into foo values (b'101', 1002, 1004, 1008, 'C');
INSERT 0 1
ddevienne=> select pg_column_size(b3), pg_column_size(i2),
pg_column_size(i4), pg_column_size(i8), pg_column_size(c) from foo;
pg_column_size | pg_column_size | pg_column_size | pg_column_size |
pg_column_size
----------------+----------------+----------------+----------------+----------------
6 | 2 | 4 | 8 |
2
(1 row)

Responses

Browse pgsql-general by date

  From Date Subject
Next Message David G. Johnston 2022-10-18 16:04:11 Re: byte-size of column values
Previous Message Dean Rasheed 2022-10-18 10:18:25 Re: Exponentiation confusion