Re: 8192 BLCKSZ ?]

From: Jan Wieck <janwieck(at)Yahoo(dot)com>
To: Don Baccus <dhogaza(at)pacifier(dot)com>
Cc: andrew(at)modulus(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: 8192 BLCKSZ ?]
Date: 2000-12-04 23:01:46
Message-ID: 200012042301.SAA18298@jupiter.jw.home
Lists: pgsql-hackers

Don Baccus wrote:
>
> ...
> I expect TOAST to work even better). Users will still be able to
> change to larger blocksizes (perhaps a wise thing to do if a large
> percentage of their data won't fit into a single PG block). Users
> using the default will be able to store rows of *awesome* length,
> efficiently.

Depends...

Actually the toaster already jumps in once a tuple exceeds
BLCKSZ/4, so with the default of 8K blocks it tries to keep
all tuples smaller than 2K (a quick arithmetic sketch follows
the list below). The reasons behind that are:

1. An average tuple size of 8K means an average of 4K of
   unused space at the end of each block. Wasting space
   means wasting I/O bandwidth.

2. Since big items are unlikely to be search criteria,
   reading them into memory for every check of a match on
   other columns is again a waste. The more big items are
   moved off the main tuple, the smaller the main table
   becomes, the more likely it is that the main tuples
   (holding the keys) stay cached, and the cheaper a
   sequential scan becomes.
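
To put rough numbers on both points, here is a quick arithmetic
sketch in SQL (plain arithmetic for illustration, not PostgreSQL
source code; the 5K tuple size is an assumption):

    -- Toaster trigger point with the default 8K block size:
    SELECT 8192 / 4 AS toast_threshold_bytes;      -- 2048

    -- Reason 1: a tuple a bit over half a block leaves the rest
    -- of the block unusable for a second tuple of the same size.
    SELECT 8192 - 5120 AS wasted_bytes_per_block;  -- a 5K tuple strands 3K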

Of course, especially for 2. there is a break-even point:
when the extra fetches needed to send toast values to the
client cost more than was saved by not reading them during
the main scan. A full-table SELECT * definitely costs more
if TOAST is involved. But who does an unqualified SELECT *
from a multi-gig table without problems anyway? Usually you
pick a single row or a few, based on some other key
attributes - don't you?

Let's take an example. You have a forum server that displays
one article plus the date and sender of all follow-ups. The
article bodies are usually big (1-10K). So you do a SELECT *
to fetch the one article actually displayed, and another
SELECT sender, date_sent just to get the info for the
follow-ups. If we assume a uniform distribution of body
sizes and an average of 10 follow-ups, that means we save
about 52K of I/O and cache usage for each article displayed -
ten follow-up bodies, averaging a bit over 5K each, that are
never read at all.
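
In SQL the example looks about like this (the table layout is
made up for illustration; only sender and date_sent appear in
the text above):

    -- A made-up schema for the forum example; bodies of 1-10K
    -- get moved to the TOAST relation, so the main heap stays
    -- narrow and cache-friendly.
    CREATE TABLE articles (
        id        serial PRIMARY KEY,
        parent_id integer REFERENCES articles (id),
        sender    text NOT NULL,
        date_sent timestamp NOT NULL,
        body      text
    );

    -- Fetch the one article actually displayed (its body is
    -- detoasted here, which we pay for exactly once):
    SELECT * FROM articles WHERE id = 42;

    -- List the follow-ups; their big bodies are never read:
    SELECT sender, date_sent FROM articles WHERE parent_id = 42;

With ten follow-ups, the second query skips roughly 52K of body
data that a SELECT * would have dragged through the cache.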

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #
