Re: vacuum, performance, and MVCC

From: "Mark Woodward" <pgsql(at)mohawksoft(dot)com>
To: "Hannu Krosing" <hannu(at)skype(dot)net>
Cc: "postgres hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: vacuum, performance, and MVCC
Date: 2006-06-27 12:48:19
Message-ID: 18357.24.91.171.78.1151412499.squirrel@mail.mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> Ühel kenal päeval, E, 2006-06-26 kell 09:10, kirjutas Mark Woodward:
>> > Ühel kenal päeval, R, 2006-06-23 kell 17:27, kirjutas Bruce
>> Momjian:
>> >> Jonah H. Harris wrote:
>> >> > On 6/23/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> >> > > What I see in this discussion is a huge amount of "the grass must
>> be
>> >> > > greener on the other side" syndrome, and hardly any recognition
>> that
>> >> > > every technique has its downsides and complications.
>> >> >
>> >> > I'm being totally objective. I don't think we should abandon
>> >> > PostgreSQL's overall design at all, because we do perform INSERTs
>> and
>> >> > DELETEs much better than most systems. However, I've looked at
>> many
>> >> > systems and how they implement UPDATE so that it is a scalable
>> >> > operation. Sure, there are costs and benefits to each
>> implementation,
>> >> > but I think we have some pretty brilliant people in this community
>> and
>> >> > can come up with an elegant design for scalable UPDATEs.
>> >>
>> >> I think the UPDATE case is similar to the bitmap index scan or
>> perhaps
>> >> bitmap indexes on disk --- there are cases we know can not be handled
>> >> well by our existing code, so we have added (or might add) these
>> >> features to try to address those difficult cases.
>> >
>> > Not really. Bitmap index scan and bitmap index are both new additions
>> > working well with existing framework.
>> >
>> > While the problem of slowdown on frequent updates is real, the
>> suggested
>> > fix is just plain wrong, as it is based on someones faulty assumption
>> on
>> > how index lookup works, and very much simplified view of how different
>> > parts of the system work to implement MVCC.
>>
>> Yes, the suggestion was based on MVCC concepts, not a particular
>> implementation.
>
> On the contrary - afaik, it was loosely based on how Oracle does it with
> its rollback segments, only assuming that rollback segments are kept in
> heap and that indexes point only to the oldest row version :p

Well, give me a little more credit than that. Yes, Oracle did play small
part in my thinking, but only in as much as "they can't do it, why can't
we?" The problem was how to get the most recent tuple to be more efficient
and not have tuples that will never be used impact performance without
excessive locking or moving data around.

It was a just a quick idea. Bruce's solution, you have to admit, is
somewhat similar.

>
>> > The original fix he "suggests" was to that imagined behaviour and thus
>> > ignored all the real problems of such change.
>>
>> The original suggestion, was nothing more than a hypothetical for the
>> purpose of discussion.
>>
>> The problem was the steady degradation of performance on frequent
>> updates.
>> That was the point of discussion. I brought up "one possible way" to
>> start a "brain storm." The discussion then morphed into critisizing the
>> example and not addressing the problem.
>
> The problem is heatedly discussed every 3-4 months.

And yet, here we are again.

>
>> Anyway, I think some decent discussion about the problem did happen, and
>> that is good.
>
> Agreed.
>
> Maybe this _was_ the best way to bring up the discussion again.

I have a way, for better or worse, I guess, of stirring up the pot. :-)

Cry as we may about MySQL, but I have a sneaking suspicion that this is
one of the issues that puts PostgreSQL at a serious disadvantage.

While heavily updated rows are a single type of problem, these days I
think *most* database deployments are as back-ends for web sites. This
problem is *very* critical to that type of application, consequently
probably why PostgreSQL has difficulty in that space.

If PostgreSQL can be made *not* to suffer performance degradation on
heavily updated rows, then that is realy the last issue in the way of it
being a completely creadible medium to large enterprise back end. This
combined with its amazing pragramability, should make it unstoppable.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Woodward 2006-06-27 13:03:55 Re: vacuum, performance, and MVCC
Previous Message Teodor Sigaev 2006-06-27 11:58:02 Re: GIN index creation extremely slow ?