Re: vacuum, performance, and MVCC

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: Hannu Krosing <hannu(at)skype(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Csaba Nagy <nagy(at)ecircle-ag(dot)com>, Mark Woodward <pgsql(at)mohawksoft(dot)com>, "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, Christopher Browne <cbbrowne(at)acm(dot)org>, postgres hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: vacuum, performance, and MVCC
Date: 2006-06-26 02:12:01
Message-ID: 200606260212.k5Q2C1S10045@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jan Wieck wrote:
> > [item1]...[tuple1]
> >
> > becomes on UPDATE:
> > ---------->
> > [item1]...[tuple1][tuple2]
> > ----->
> >
> > on another UPDATE, if tuple1 is no longer visible:
> >
> > ------------------>
> > [item1]...[tuple1][tuple2]
> > <------
> >
> >> Another problem with this is that even if you find such row, it doesn't
> >> spare you the index traversal. The dead row whos item id you're reusing
> >> might have resulted from an insert that aborted or crashed before it
> >> finished creating all index entries. Or some of its index entries might
> >> already be flagged known dead, and you better reset those flags.
> >
> > You can only reuse heap rows that were created and expired by committed
> > transactions. In fact, you can only UPDATE a row that was created by a
> > committed transaction. You cannot _reuse_ any row, but only a row that
> > is being UPDATEd. Also, it cannot be known dead because it are are in
> > the process of updating it.
>
> Now you lost me. What do you mean "a row that is being UPDATEd"? The row
> (version) being UPDATEd right now cannot be expired, or why would you
> update that one? And if your transaction rolls back later, the row you
> update right now must be the one surviving.

It can only be a non-visible version of the row earlier in the UPDATE
chain, not the actual one being updated.

> Any row that was created by a committed transaction does indeed have all
> the index entries created. But if it is deleted and expired, that means
> that the transaction that stamped xmax has committed and is outside of
> every existing snapshot. You can only reuse a slot that is used by a
> tuple that satisfies the vacuum snapshot. And a tuple that satisfies
> that snapshot has potentially index entries flagged known dead.

When you are using the update chaining, you can't mark that index row as
dead because it actually points to more than one row on the page, some
are non-visible, some are visible.

> > I am thinking my idea was not fully understood. Hopefully this email
> > helps.
>
> I must be missing something because I still don't see how it can work.

I just posted pseudo-code. Hope that helps.

--
Bruce Momjian bruce(at)momjian(dot)us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jan Wieck 2006-06-26 05:17:29 Re: vacuum, performance, and MVCC
Previous Message Jan Wieck 2006-06-26 02:08:50 Re: vacuum, performance, and MVCC