Re: Remove xmin and cmin from frozen tuples

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Remove xmin and cmin from frozen tuples
Date: 2005-09-01 15:34:09
Message-ID: 1125588849.3956.183.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 2005-08-31 at 22:25 -0400, Alvaro Herrera wrote:
> On Thu, Sep 01, 2005 at 10:45:44AM +0900, ITAGAKI Takahiro wrote:
>
> Hi,
>
> > I think it would be a waste to retain xmin and cmin for frozen tuples
> > because their values represent only 'visible for all transactions'.
> > Additionally, most tuples in database can be frozen potentially.
>
> I think this is an interesting idea.

Agreed, especially since it would avoid the need to vacuum altogether.

> I was thinking that when the tuple
> needs to be obsoleted it would need to grow to accomodate the Xmax, but
> you are not actually proposing to remove that, so it seems sensible. In
> fact, it is perfectly reasonable to remove Xmin and Cmin, because after
> the tuple is frozen, the Xmin never changes again.

It's a good idea, but the Xmin is set to FrozenTransactionId, which is
how we know it is frozen, so how can we remove Xmin? The way to do this
is surely by using a row version id that is different for this format.

Getting 8 or 16 bytes per row back would be a very useful gain.

> Now, one thing of note is that you need to "compress" the page in order
> to actually be able to use the just-freed space. VACUUM could do that,
> but maybe it would be better to do it on-line -- the freezing process is
> going to have to write the page regardless. I wonder if with your patch
> the page is compressed on the same VACUUM execution that freezes the
> tuple?

Only if you do a FULL, which is currently incompatible with a FREEZE.

There's no point in compressing a block if you can't also redistribute
rows between blocks to fill up the spaces, so another reason why it has
to be a FULL. Unless you do this at load time, which is why I guess you
mention....

> One thing that comes to mind is that this makes somewhat easier to build
> a tool to write pre-built tables, for bulk-loading purposes. You just
> construct the binary file with the HEAP_FROZEN bit set, and then attach
> the file to a dummy table. (Then again, you can do it today, using a
> Xmin of FrozenTransactionId. I wonder why the Bizgres people isn't
> advocating a tool to do that. It is very hard to do with user-defined
> types, but for BI/DW you mostly don't need those, do you?)

Loading a table using COPY with frozen bits set was suggested in May, so
yeh... it was suggested. At that time it was rejected, since earlier
transactions would then be able to see rows they ought not be able to
see. Thinking some more about this, this is only the inverse situation
of a TRUNCATE. With truncate we remove tuples that ought to still be
visible to pre-existing transactions. So there shouldn't really be an
issue with loading pre-frozen tuples - as long as you accept the
consequences for row visibility.

Externally writing blocks is possible, but it bypasses a lot of other
features. My current preference would be to have bulk_heap_insert()
function to add a whole page at a time rather than inserting rows one at
at a time. The main objective for a load is to make it disk bound; once
we've achieved that by some further tuning, writing an external file
would cost around the same as writing it internally from the DBMS.
Oracle (direct path loader) and Teradata (Fastload) load data in
complete blocks using a reduced code pathway, so I guess I was just
following on, but I'm genuinely open to further persuasion if there is a
better way.

Having a table marked as INSERT ONLY would allow us to save 8 bytes/row,
loading it pre-frozen (in some way) would save another 8 bytes/row and
allow us to permanently avoid VACUUMing the table. That would be even
better when we have per-table XID wrap avoidance.

Best Regards, Simon Riggs

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew - Supernews 2005-09-01 15:34:56 Re: TODO item: set proper permissions on non-system schemas
Previous Message Peter Eisentraut 2005-09-01 15:30:34 Version number in psql banner