Re: Reducing tuple overhead

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Sawada Masahiko <sawada(dot)mshk(at)gmail(dot)com>, Greg Stark <stark(at)mit(dot)edu>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: Reducing tuple overhead
Date: 2015-06-08 05:39:25
Message-ID: CAA4eK1JJXEbUyaOy+49O42sFrHRH9h2jtos+8OMk4mYPKOY_SQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Jun 7, 2015 at 3:02 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On 23 April 2015 at 17:24, Andres Freund <andres(at)anarazel(dot)de> wrote:
>>
>
>
> It's hard to see how to save space there without reference to a specific
use case. I see different solutions depending upon whether we assume a low
number of transactions or a high number of transactions.
>

I have tried to check theoretically, how much difference such a
change could give us.

Assuming BLK_SZ - 8192 bytes; Page header - 24 bytes; each
line pointer - 4 bytes; average tuple - 150 bytes, roughly 53
tuples could be accommodated in one page. Now each of this
tuple contains 12 bytes transaction information (xmin, xmax,
cid/combocid). Now considering that in average workload 4
transactions operate on a page at the same time (I think for a
workload like pgbench tpc-b, it shouldn't be more otherwise it
should have been visible in perf reports), 4 additional tuples [1]
could be accommodated on a page which is approximately 7% savings
in space (which in-turns means that much less I/O). This gain
could vary based on tuple size, no. of transactions that can operate
on page, page size..

Some additional benefits that I could see from such a change:

1. I think we don't need to traverse the whole page while freezing,
so there should be some savings in freeze operation as well.

2. Now I think with this, we might be able to reduce WAL also
if we can avoid some transaction related info in the cases where
it is currently stored (update/delete).

3. Another gain could come if we want to add transaction information
in index segment as well, because if such information can be stored at
page level, then there won't be much impact in adding it there which
will help us in avoiding multiple-passes of Vaccum (heap and index
could be vacuumed separately which will definitely help in IO and
bloat reduction).

[1]
Calc for 4 additional tuples:
(saving by removing trans info from tuple - new space consumed by trans)
/new tuple size
(53 * 12 - 12 * 4) / (150 - 12) = 4.26

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2015-06-08 06:09:48 Re: amcheck prototype
Previous Message Michael Paquier 2015-06-08 04:57:32 Collection of memory leaks for ECPG driver