Re: TOAST usage setting

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: TOAST usage setting
Date: 2007-05-29 14:04:12
Message-ID: 200705291404.l4TE4CX15194@momjian.us
Lists: pgsql-hackers

Gregory Stark wrote:
> "Bruce Momjian" <bruce(at)momjian(dot)us> writes:
>
> > I tested TOAST using a method similar to the above method against CVS
> > HEAD, with default shared_buffers = 32MB and no assert()s. I created
> > backends with power-of-2 settings for TOAST_TUPLES_PER_PAGE (4 (default),
> > 8, 16, 32, 64), which gives TOAST/non-TOAST breakpoints of 2k (default),
> > 1k, 512, 256, and 128, roughly.
> >
> > The results are here:
> >
> > http://momjian.us/expire/TOAST/
> >
> > Strangely, 128 bytes seems to be the break-even point for TOAST and
> > non-TOAST, even for sequential scans of the entire heap touching all
> > long row values. I am somewhat confused why TOAST has faster access
> > than inline heap data.
>
> Did your test also imply setting the MAX_TOAST_CHUNK_SIZE (or however that's
> spelled)? And what size long values were you actually storing? How did you
> generate them?

Please look at the script sqltest.sh at that URL. I did not modify
TOAST_MAX_CHUNK_SIZE, but it changes based on TOAST_TUPLES_PER_PAGE,
which I did change.
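
In case it is useful, here is a rough way to see where the breakpoint
lands on a given build without digging through the headers (untested
sketch; the table name and sizes are just for illustration, and the
column is forced to EXTERNAL storage because repeat()'s output
compresses too well to be pushed out of line otherwise):

    -- Load rows of increasing width and watch the toast relation grow
    -- once values cross this build's TOAST threshold.
    CREATE TABLE toast_probe (v text);
    ALTER TABLE toast_probe ALTER COLUMN v SET STORAGE EXTERNAL;
    INSERT INTO toast_probe
        SELECT repeat('x', len) FROM generate_series(128, 4096, 128) AS len;
    SELECT pg_relation_size(reltoastrelid) AS toast_bytes
      FROM pg_class
     WHERE relname = 'toast_probe';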

> I wonder if what's happening is that you have large chunks which when stored
> inline are leaving lots of dead space in the table. Ie, if you're generating
> values with size near 2k and the default chunk size you would expect to find
> an average of 1k dead space per page, or a 12.5% drain on performance. As you
> lower the chunk size you decrease that margin.

Well, that could be it, but effectively that is what would happen in the
real world too.
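
If someone wants to eyeball that dead space directly, something along
these lines should give a rough per-page picture (untested; it reuses
the block-number trick from the query further down, assumes the default
8K block size, and the +28 bytes per row is only a stand-in for tuple
header plus line pointer):

    -- Very rough free-space estimate for the ten emptiest-looking pages;
    -- treat the numbers as ballpark only.
    SELECT (point_in(tidout(ctid)))[0] AS block,
           count(*) AS tuples,
           8192 - sum(pg_column_size(foo.*) + 28) AS approx_free_bytes
      FROM foo
     GROUP BY 1
     ORDER BY 3 DESC
     LIMIT 10;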

> However I agree that it's hard to believe that the costs of random access
> wouldn't swamp that 12.5% overhead pretty quickly.
>
> One query I used when measuring the impact of the variable varlena stuff was
> this which gives the distribution of tuples/page over a table:
>
> SELECT count(*), n
>   FROM (SELECT count(*) AS n
>           FROM foo
>          GROUP BY (point_in(tidout(ctid)))[0]
>        ) AS x
>  GROUP BY n;
>
> Which might help you peek at what's going on. You could also combine it
> with pg_column_size(foo.*) to measure the size of the tuple. I think that
> will measure the size of the tuple as-is, before the columns are detoasted.

Please use my test script and see what you find.
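
For what it's worth, combining the two as you suggest might look
something like this (untested sketch; sizes are as stored, before any
detoasting):

    -- Tuples per page together with the average stored row size on
    -- those pages, using the same ctid-to-block trick as above.
    SELECT n AS tuples_per_page,
           count(*) AS pages,
           round(avg(avg_size)) AS avg_tuple_size
      FROM (SELECT count(*) AS n,
                   avg(pg_column_size(foo.*)) AS avg_size
              FROM foo
             GROUP BY (point_in(tidout(ctid)))[0]
           ) AS x
     GROUP BY n
     ORDER BY n;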

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
