Re: vacuum, performance, and MVCC

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Csaba Nagy <nagy(at)ecircle-ag(dot)com>, Hannu Krosing <hannu(at)skype(dot)net>, Mark Woodward <pgsql(at)mohawksoft(dot)com>, "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, Christopher Browne <cbbrowne(at)acm(dot)org>
Subject: Re: vacuum, performance, and MVCC
Date: 2006-06-27 03:03:19
Message-ID: 20060627030319.GC44573@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 26, 2006 at 02:32:59PM -0400, Bruce Momjian wrote:
>
> It is certainly possible to do what you are suggesting, that is have two
> index entries point to same chain head, and have the index access
> routines figure out if the index qualifications still hold, but that
> seems like a lot of overhead.
>
> Also, once there is only one visible row in the chain, removing old
> index entries seems quite complex because you have to have vacuum keep
> the qualifications of each row to figure out which index tuple is the
> valid one (seems messy).

Perhaps my point got lost... in the case where no index keys change
during an update, SITC seems superior in every way to my proposal. My
idea (let's call it Index Tuple Page Consolidation, ITPC) would be
beneficial to UPDATEs that modify one or more index keys but still put
the tuple on the same page. Where SITC would be most useful for tables
that have a very heavy update rate and very few indexes, ITPC would
benefit tables that have more indexes on them; where presumably it's
much more likely for UPDATEs to change at least one index key (which
means SITC goes out the window, if I understand it correctly). If I'm
missing something and SITC can in fact deal with some index keys
changing during an UPDATE, then I see no reason for ITPC.

> ---------------------------------------------------------------------------
>
> Jim C. Nasby wrote:
> > On Sun, Jun 25, 2006 at 09:13:48PM +0300, Heikki Linnakangas wrote:
> > > >If you can't expire the old row because one of the indexed columns was
> > > >modified, I see no reason to try to reduce the additional index entries.
> > >
> > > It won't enable early expiration, but it means less work to do on update.
> > > If there's a lot of indexes, not having to add so many index tuples can be
> > > a significant saving.
> >
> > While catching up on this thread, the following idea came to me that I
> > think would allow for not updating an index on an UPDATE if it's key
> > doesn't change. If I understand Bruce's SITC proposal correctly, this
> > would differ in that SITC requires that no index keys change.
> >
> > My idea is that if an UPDATE places the new tuple on the same page as
> > the old tuple, it will not create new index entries for any indexes
> > where the key doesn't change. This means that when fetching tuples from
> > the index, ctid would have to be followed until you found the version
> > you wanted OR you found the first ctid that pointed to a different page
> > (because that tuple will have it's own index entry) OR you found a tuple
> > with a different value for the key of the index you're using (because
> > it'd be invalid, and there'd be a different index entry for it). I
> > believe that the behavior of the index hint bits would also have to
> > change somewhat, as each index entry would now essentially be pointing
> > at all the tuples in the ctid chain that exist on a page, not just
> > single tuple.
> >
> > In the case of an UPDATE that needs to put the new tuple on a different
> > page, our current behavior would be used. This means that the hint bits
> > would still be useful in limiting the number of heap pages you hit. I
> > also believe this means that we wouldn't suffer any additional overhead
> > from our current code when there isn't much free space on pages.
> >
> > Since SITC allows for in-page space reuse without vacuuming only when no
> > index keys change, it's most useful for very heavily updated tables such
> > as session handlers or queue tables, because those tables typically have
> > very few indexes, so it's pretty unlikely that an index key will change.
> > For more general-purpose tables that have more indexes but still see a
> > fair number of updates to a subset of rows, not having to update every
> > index would likely be a win. I also don't see any reason why both
> > options couldn't be used together.
> > --
> > Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
> > Pervasive Software http://pervasive.com work: 512-231-6117
> > vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
> >
>
> --
> Bruce Momjian bruce(at)momjian(dot)us
> EnterpriseDB http://www.enterprisedb.com
>
> + If your life is a hard drive, Christ can be your backup. +
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>

--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2006-06-27 03:08:24 Re: vacuum, performance, and MVCC
Previous Message Bruce Momjian 2006-06-27 02:49:30 Re: [HACKERS] PQescapeIdentifier