Skip site navigation (1) Skip section navigation (2)

Re: TOAST usage setting

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: TOAST usage setting
Date: 2007-05-29 14:04:12
Message-ID: 200705291404.l4TE4CX15194@momjian.us (view raw or flat)
Thread:
Lists: pgsql-hackers
Gregory Stark wrote:
> "Bruce Momjian" <bruce(at)momjian(dot)us> writes:
> 
> > I tested TOAST using a method similar to the above method against CVS
> > HEAD, with default shared_buffers = 32MB and no assert()s.  I created
> > backends with power-of-2 seetings for TOAST_TUPLES_PER_PAGE (4(default),
> > 8, 16, 32, 64) which gives TOAST/non-TOAST breakpoints of 2k(default),
> > 1k, 512, 256, and 128, roughly.
> >
> > The results are here:
> >
> > 	http://momjian.us/expire/TOAST/
> >
> > Strangely, 128 bytes seems to be the break-even point for TOAST and
> > non-TOAST, even for sequential scans of the entire heap touching all
> > long row values.  I am somewhat confused why TOAST has faster access
> > than inline heap data.
> 
> Did your test also imply setting the MAX_TOAST_CHUNK_SIZE (or however that's
> spelled)? And what size long values were you actually storing? How did you
> generate them?

Please look at the script sqltest.sh at that URL.  I did not modify
TOAST_MAX_CHUNK_SIZE, but it changes based on TOAST_TUPLES_PER_PAGE,
which I did change.

> I wonder if what's happening is that you have large chunks which when stored
> inline are leaving lots of dead space in the table. Ie, if you're generating
> values with size near 2k and the default chunk size you would expect to find
> an average of 1k dead space per page, or a 12.5% drain on performance. As you
> lower the chunk size you decrease that margin.

Well, that could be it, but effectively that is what would happen in the
real world too.

> However I agree that it's hard to believe that the costs of random access
> wouldn't swamp that 12.5% overhead pretty quickly.
> 
> One query I used when measuring the impact of the variable varlena stuff was
> this which gives the distribution of tuples/page over a table:
> 
> SELECT count(*),n 
>   FROM (SELECT count(*) AS n
>           FROM foo 
>          GROUP BY (point_in(tidout(ctid)))[0]
>        ) as x
>  GROUP BY n;
> 
> Which might help you peek at what's going on. You could also combine
> pg_column_size(foo.*) to measure the size of the tuple. I think that will
> measure the size of the tuple as is before the columns are detoasted.

Please use my test script and see what you find.

-- 
  Bruce Momjian  <bruce(at)momjian(dot)us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

In response to

Responses

pgsql-hackers by date

Next:From: Zdenek KotalaDate: 2007-05-29 14:14:34
Subject: Re: Users specific to a Database
Previous:From: Tom LaneDate: 2007-05-29 14:00:06
Subject: Re: What is the maximum encoding-conversion growth rate, anyway?

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group