Re: UNDO and in-place update

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UNDO and in-place update
Date: 2016-11-24 16:40:46
Message-ID: 20161124164046.GE1668@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 24, 2016 at 04:06:14PM +0000, Greg Stark wrote:
> For snapshot isolation Oracle has yet a *third* copy of the data in a
> space called the "rollback segment(s)". When you update a row in a
> block you save the whole block in the rollback segment. When you try
> to access a block you check if the CSN -- which is basically
> equivalent to our LSN -- is newer than your snapshot and if it is you
> fetch the old version of the block from the rollback.

I assume they can do this with good performance because, unlike UNDO,
they don't need to fsync the pages kept for snapshot visibility --- if
the server crashes, you don't need them.

> Essentially their MVCC is done on a per-block level rather than a
> per-row level and they keep only the newest version of the block in
> the table, the rest are in the rollback segment. For what it's worth
> I think our approach is cleaner and more flexible. They had a lot of
> trouble with their approach over the years and it works well only
> because they invested an enormous amount of development in it and also
> because people throw a lot of hardware at it too.
>
> I think the main use case we have trouble with is actually the "update
> every row in the table" type of update which requires we write to
> every block, plus a second copy of every block, plus write full pages
> of both copies, then later set hint bits dirtying pages again and
> generating more full pages writes, then later come along and vacuum
> which requires two more writes of every block, etc. If we had a
> solution for the special case of an update that replaces every row in
> a page that I think would complement HOT nicely and go a long way
> towards fixing our issues.

Any ideas how we could optimize the update-all workload? Create a new
heap file and indexes?

Our previous worst-case workload was a single row that was updated
repeatedly, but HOT fixed that for non-indexed rows, and WARM will
improve it for indexed rows.

One of the big takeaways when writing my MVCC talk is that no matter how
much we optimize UPDATE, we still need cleanup for deletes and aborted
transactions.

I am hopeful we can continue reducing the number of times we write a
page for maintenance purposes.

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-11-24 16:51:24 Re: [bugfix] commit timestamps ERROR on lookup of FrozenTransactionId
Previous Message Robert Haas 2016-11-24 16:39:45 Re: UNDO and in-place update