Re: 1 char in the world

From: Hannu Krosing <hannu(at)tm(dot)ee>
To: Matt Mello <alien(at)spaceship(dot)com>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Re: 1 char in the world
Date: 2003-01-29 12:18:20
Message-ID: 1043842700.5008.11.camel@huli
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, 2003-01-29 at 06:22, Matt Mello wrote:
> TEXT vs "char" ... vs BOOLEAN
>
> I am porting from Informix to PG. In doing so, I had to pick some data
> types for fields, and began wondering about the performance of char/text
> fields with one character. For example, I have a field which has one of
> the following values/states: {'A', 'D', 'F', 'U'}. Since CHAR(n),
> VARCHAR, and TEXT are all supposed to have the same performance
> according to the docs, it seems that they will all perform the same.
> For this reason, I did not squabble over which one of these to use.
> However, since "char" is implemented differently, I thought I would
> compare it to one of the others. I chose to pit TEXT against "char".
>
> Test query = explain analyze select count(*) from table where onechar='D';
> Table size = 512 wide [mostly TEXT] * 400000 rows
> Performance averages:
> "char" 44ms
> TEXT 63ms
>
> This seems somewhat reasonable, and makes me want to use "char" for my
> single-char field. Does everyone else find this to be reasonable? Is
> this pretty much the behavior I can expect on extraordinarily large
> tables, too?

The actual compares will likely stay faster for char than for text.

OTOH the actual storage of one-char datatype should not play so
significant role for very large tables, even if this is the only field
in that table, as most of the overhead will be in other places - storage
overhead in page/tuple headers, performance in retrieving the
pages/tuples and cache lookups, etc.

Also, for very big tables you will most likely want to restrict selects
on other criteria than a 4-valued field, so that indexes could be used
in retrieving data.

> And, should I worry about things like the backend
> developers removing "char" as a type later?
>
> --
>
> This naturally led me to another question. How do TEXT, "char", and
> BOOLEAN compare for storing t/f values. The test results I saw were
> surprising.
>
> Test query=
> "char"/TEXT: explain analyze select count(*) from table where bool='Y';

You could also try just

select count(*) from table where bool;

> boolean: explain analyze select count(*) from table where bool=true;
> Table size (see above)
> Performance averages:
> TEXT 24ms
> BOOLEAN 28ms
> "char" 17ms
>
> Why does boolean rate closer to TEXT than "char"? I would think that
> BOOLEANs would actually be stored like "char"s to prevent using the
> extra 4 bytes with TEXT types.
>
> Based on these results, I will probably store my booleans as "char"
> instead of boolean. I don't use stored procedures with my application
> server, so I should never need my booleans to be the BOOLEAN type. I
> can convert faster in my own code.
>
> --
>
> NOTE: the above tests all had the same relative data in the different
> fields (what was in TEXT could be found in "char", etc.) and were all
> indexed equally.

Did you repeat the texts enough times to be sure that you get reliable
results ?

>
> Thanks!
--
Hannu Krosing <hannu(at)tm(dot)ee>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2003-01-29 15:56:27 Re: 1 char in the world
Previous Message Justin Clift 2003-01-29 10:20:00 Re: 1 char in the world