Skip site navigation (1) Skip section navigation (2)

How are text columns stored?

From: Meetesh Karia <meetesh(dot)karia(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: How are text columns stored?
Date: 2005-06-23 10:40:15
Message-ID: fc5b04ca050623034029b7e532@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
Hi all,

I'm running PG 8.0.3 on WinXP and I'm coming across some performance issues 
related to text columns. Basically, it appears as though PG is storing the 
text data inline with the rest of the row data as queries that don't touch 
the text column are slower when there is data in the text column than when 
there isn't. According to section 8.3 of the doc:

"Long values are also stored in background tables so they do not interfere 
with rapid access to the shorter column values."

So, how long does a value have to be to be considered "long"?

If necessary, here is some more specific information about what I'm doing:

1) I create a new table and use 'COPY FROM' to populate it. When the data in 
the text column is limited to a max of 60 characters, this part takes 2-3 
minutes less than when the data is at its full size. The table will be 
populated with ~750k rows. Here's an example of the table I create (no, I 
didn't name the fields "vc_field1", "vc_field2", etc ;) ):

create table my_table_import
(
vc_field1 varchar(255),
vc_field2 varchar(255),
vc_field3 varchar(255),
f_field1 float8,
text_field1 text,
ts_field1 timestamp,
v_field4 varchar(255),
i_field1 int8,
i_field2 int8
);

2) I populate i_field1 and i_field2 from lookup tables. This step takes 
about 7 mins longer with the full text data than with the limited data.

update my_table_import
set i_field1 = f.i_field1,
i_field2 = u.i_field2
from lookup1 as f, lookup2 as u
where vc_field2 = f.vc_field2
and vc_field1 = u.vc_field1;

3) I then create an index on this table and run a couple of queries on it. 
Each of these queries takes about 10 minutes longer with the full text data 
then without it. Here's the index that I create and an example of one of the 
queries that I run:

create index idx_my_table_import_i1_i2 on my_table_import (i_field1, 
i_field2);
analyze my_table_import;

select i_field1, i_field2, max(ts_field1) as ts_field1, count(*) as 
dup_count
from my_table_import
where i_field1 between 0 and 9999
group by i_field1, i_field2

Thanks for the help,
Meetesh Karia

Responses

pgsql-performance by date

Next:From: Keith WorthingtonDate: 2005-06-23 12:16:22
Subject: Re: Performance Tuning Article
Previous:From: Chun Yit(Chronos)Date: 2005-06-23 09:56:52
Subject: Postgresql7.4.5 running slow on plpgsql function

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group