Re: Reduce heap tuple header size

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Manfred Koizar <mkoi-pg(at)aon(dot)at>, pgsql-patches(at)postgresql(dot)org
Subject: Re: Reduce heap tuple header size
Date: 2002-06-25 20:15:45
Message-ID: 3D18CF71.99D97F74@Yahoo.com
Lists: pgsql-hackers pgsql-patches

Bruce Momjian wrote:
>
> Jan, any update on this? Are the numbers correct?

Sorry, but this took some time.

Well, it turned out that pgbench does a terrible job with runtimes below
30 minutes. It seems that one checkpoint more or less can have a
significant impact on the numbers reported by such a run.

Also, starting off with a populated cache (ramp-up) seems to be a very
good idea. So my advice for running pgbench is to do an initdb before
running (to wipe out logfile creation/reuse issues), then populate a
fresh database with a reasonable scaling factor. Run pgbench with a high
enough -c and -t that it runs for at least 5 minutes. Then do the actual
measurement with a pgbench run whose settings keep the system busy for
30 minutes or more. Needless to say, keep your fingers (and everyone
else's too) off the system during that time. Shut down unneeded
services, and especially cron!
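
The procedure above can be sketched roughly as follows. This is only an illustration, not the script that was actually used: the data directory path, the helper names `transactions_per_client` and `benchmark_plan`, and the expected throughput of 36 tps are assumptions made for the example. The script only builds the command lines. It does not run them, and starting the server and creating the database between initdb and pgbench is left out.

```python
import math


def transactions_per_client(expected_tps, minutes, clients):
    """Per-client -t value so that clients * t transactions last about `minutes`."""
    total_transactions = expected_tps * minutes * 60
    return math.ceil(total_transactions / clients)


def benchmark_plan(datadir, db, scale, clients, expected_tps):
    """Return the command lines for a fresh cluster, a ramp-up run and a measurement run."""
    warmup_t = transactions_per_client(expected_tps, 5, clients)    # at least 5 minutes
    measure_t = transactions_per_client(expected_tps, 30, clients)  # 30 minutes or more
    return [
        ["initdb", "-D", datadir],                # fresh cluster, no reused logfiles
        # (start the server and create the database here)
        ["pgbench", "-i", "-s", str(scale), db],  # populate with a scaling factor
        ["pgbench", "-c", str(clients), "-t", str(warmup_t), db],   # ramp-up
        ["pgbench", "-c", str(clients), "-t", str(measure_t), db],  # measurement
    ]


# Hypothetical values: scale 10 and 20 clients as in the quoted runs, about 36 tps.
for command in benchmark_plan("/tmp/pgbench-data", "db", 10, 20, 36):
    print(" ".join(command))
```

Since -t counts transactions per client, a run with -c 20 -t 500 issues 10000 transactions in total, which at roughly 36 tps finishes in under five minutes.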

Using the above, the discussed change to the tuple header shows less
than 1% difference.
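
To see why the short runs were misleading, one can compare the difference between the two averages with the spread inside each set of runs. The sketch below uses the tps figures from the short runs quoted further down. The helper names `relative_difference` and `relative_spread` are made up for this example, and the long-run figures are not reproduced here because they were not posted.

```python
from statistics import mean


def relative_difference(baseline, candidate):
    """Difference of the mean tps values, relative to the baseline mean."""
    base = mean(baseline)
    return (mean(candidate) - base) / base


def relative_spread(samples):
    """Range (max - min) of the samples, relative to their mean."""
    return (max(samples) - min(samples)) / mean(samples)


# tps values from the short runs quoted below (pgbench -c 20 -t 500)
cvs_tip = [34.1, 38.7, 36.6]
patched = [37.0, 41.1, 41.1]

print(f"difference of means: {relative_difference(cvs_tip, patched):+.1%}")
print(f"spread, CVS tip:     {relative_spread(cvs_tip):.1%}")
print(f"spread, with patch:  {relative_spread(patched):.1%}")
```

With these inputs the run-to-run spread within each configuration is larger than the difference between the two averages, so three short runs cannot separate the two builds.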

Sorry for all the confusion.
Jan

>
> ---------------------------------------------------------------------------
>
> Jan Wieck wrote:
> > Bruce Momjian wrote:
> > >
> > > Jan Wieck wrote:
> > > >
> > > > Did someone run at least pgbench with/without that patch applied?
> > >
> > > No, but he did perform this analysis:
> > >
> > > > thus reducing the additional cost to one t_infomask compare,
> > > > because the Satisfies functions only access Cmin and Cmax,
> > > > when HEAP_MOVED is known to be not set.
> > > >
> > > > OTOH experimenting with a moderately sized "out of production"
> > > > database I got the following results:
> > > >         |       |        | pages  | pages |
> > > > relkind | count | tuples | before | after | savings
> > > > --------+-------+--------+--------+-------+--------
> > > > i       |    31 | 808146 |   8330 |  8330 |   0.00%
> > > > r       |    32 | 612968 |  13572 | 13184 |   2.86%
> > > > all     |    63 |        |  21902 | 21514 |   1.77%
> > > >
> > > > 2.86% fewer heap pages mean 2.86% less disk IO caused by heap pages.
> > > > Considering that index pages tend to benefit more from caching
> > > > we conclude that heap pages contribute more to the overall
> > > > IO load, so the total savings in the number of disk IOs should
> > > > be better than the 1.77% shown in the table above. I think
> > > > this outweighs a few CPU cycles now and then.
> >
> > This anawhat? This is proof that this patch is able to save not even
> > 3% of disk space in a production environment, plus an assumption that the
> > saved IO outweighs the extra effort in the tuple visibility checks.
> >
> > Here are some numbers:
> >
> > P3 850MHz 256MB RAM IDE
> > postmaster -N256 -B8192
> > pgbench -i -s 10 db
> > pgbench -c 20 -t 500 db
> >
> >
> > Current CVS tip: tps 34.1, 38.7, 36.6
> > avg(tps) 36.4
> >
> > With patch: tps 37.0, 41.1, 41.1
> > avg(tps) 39.7
> >
> > So it saves less than 3% disk space at the cost of about 9% performance
> > loss. If we can do the same the other way around I'd go for wasting some
> > more disk space.
> >
> >
> > Jan
> >
> > --
> >
> > #======================================================================#
> > # It's easier to get forgiveness for being wrong than for being right. #
> > # Let's break this rule - forgive me. #
> > #================================================== JanWieck(at)Yahoo(dot)com #
> >
>
> --
> Bruce Momjian | http://candle.pha.pa.us
> pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
> + If your life is a hard drive, | 830 Blythe Avenue
> + Christ can be your backup. | Drexel Hill, Pennsylvania 19026

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #
