Re: VARCHAR -vs- CHAR: huge performance difference?

From: "Scott Marlowe" <smarlowe(at)qwest(dot)net>
To: "C(dot) Bensend" <benny(at)bennyvision(dot)com>
Cc: pgsql-admin(at)postgresql(dot)org
Subject: Re: VARCHAR -vs- CHAR: huge performance difference?
Date: 2004-06-16 05:31:11
Message-ID: 1087363871.1582.20.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Tue, 2004-06-15 at 21:09, C. Bensend wrote:
> > Right. The only thing that will do for you is waste space with padding
> > blanks. The performance improvement you saw was due to something else,
> > which I theorize is not having to waste time reading dead space.
> >
> > Since you obviously doubt this assertion, try the same experiment except
> > load the data into a new table with the *same* structure as the original
> > (ie, all varchar). You'll see the same or probably better performance.
>
> Hi Tom,
>
> I don't doubt your assertion, I just don't understand it all yet. :)
> Let me assure you, you're the expert here, not I.
>
> I did as you suggested - I created a third copy of the table, using
> the exact same structure. And you're absolutely right - it was lightning
> fast (around 100ms).
>
> >> Actually, all databases on this server are vacuumed nightly, right
> >> before backups.
> >
> > Not often enough evidently...
>
> This statement worries me a bit. The data in the original table was
> bulk-loaded the other night (less than three days I'm sure), and new
> entries have been added at the approximate rate of 300 per day. Is this
> going to continue to happen?
>
> Or do I just need to vacuum more often? I _did_ try a vacuum before
> asking the list for help, but it didn't give any improvement (just a
> vacuum analyze).
>
> > Note the difference in "pages". emails is nearly double the physical
> > size, even though the live data in it is doubtless smaller. (You could
> > use the contrib/pgstattuple functions to see exactly how much live data
> > there is.)
>
> OK. I see (and understand) the pages value now.
>
> > I was actually expecting to see more than a 2:1 difference in file size,
> > seeing that you reported more than a 2:1 difference in read time. It
> > could be that there is also some question of physical layout of the file
> > on disk. The original table probably accreted over time and was given
> > space that's not very consecutive on disk. The copied table was
> > written in one swoop, more or less, and is very likely stored in a more
> > nearly consecutive set of physical disk blocks. I'm not sure how you'd
> > investigate this theory though --- AFAIK there isn't any real easy way
> > to find out how badly fragmented a file is in most Unix filesystems.
>
> Ugh. So, what would you recommend as a fix? I see the problem, and I
> see the fix that just worked, but I certainly can't be the only person
> around that is using a "wide" table with a lot of character data being
> added at a rather slow rate...

You might want to look into the autovacuum daemon, and / or increasing
fsm settings to be large enough to hold all the spare tuples released by
vacuuming.

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Rajesh Kumar Mallah 2004-06-16 07:53:12 weired behavior... after pg_resetxlog-> dump->initdb-->reload.
Previous Message Adam Smith 2004-06-16 05:24:30 Re: Installation problem - mutex_lock/unlock or libpq.so ?