Re: Case Insensitive Queries

From: Ian Lance Taylor <ian(at)airs(dot)com>
To: Mark <mark(at)zserve(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-sql(at)postgresql(dot)org
Subject: Re: Case Insensitive Queries
Date: 2001-05-30 19:51:03
Message-ID: si8zjey5so.fsf@daffy.airs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Mark <mark(at)zserve(dot)com> writes:

> with varchars, as I understand it (and postgresql may be different),
> each varchar field has a header that stores the length of the particular
> entry's length. Further, if the varchar field precedes another field,
> the system loses the ability to use fixed-length addressing to access
> the field after the varchar, since the system must determine on a
> case-by-case basis how to access the field after the varchar. It has to
> calculate the size of the varchar, add that to the start of the varchar
> (plus header length), and then it has the address of the next field.
> With non-variant char it is fixed length, so selects and updates operate
> much more quickly. Even the postgresql documentation asserts something
> similar to this:
>
> 'Both TEXT and VARCHAR() store only the number of characters in the
> string. CHAR(length) is similar to VARCHAR(), except it always stores
> exactly length characters. This type pads the value with trailing spaces
> to achieve the specified length, and provides slightly faster access
> than TEXT or VARCHAR().'
>
> Perhaps I am misinformed.

You are misinformed with respect to Postgres. The implementation of
CHAR(length) is just like the implementation of VARCHAR(). It stores
a header with the entry length. That header is considered in all
functions in the same way that the VARCHAR() header is.

I don't know what documentation you are quoting, but it seems somewhat
misleading to me.

Here is the comment from the implementation
(src/backend/utils/adt/varchar.c):

/*
* CHAR() and VARCHAR() types are part of the ANSI SQL standard. CHAR()
* is for blank-padded string whose length is specified in CREATE TABLE.
* VARCHAR is for storing string whose length is at most the length specified
* at CREATE TABLE time.
*
* It's hard to implement these types because we cannot figure out
* the length of the type from the type itself. I change (hopefully all) the
* fmgr calls that invoke input functions of a data type to supply the
* length also. (eg. in INSERTs, we have the tupleDescriptor which contains
* the length of the attributes and hence the exact length of the char() or
* varchar(). We pass this to bpcharin() or varcharin().) In the case where
* we cannot determine the length, we pass in -1 instead and the input string
* must be null-terminated.
*
* We actually implement this as a varlena so that we don't have to pass in
* the length for the comparison functions. (The difference between these
* types and "text" is that we truncate and possibly blank-pad the string
* at insertion time.)
*
* - ay 6/95
*/

Ian

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Chris Ruprecht 2001-05-30 20:14:17 Unique record Identifier?
Previous Message Stephan Szabo 2001-05-30 18:54:54 Re: Case Insensitive Queries