Re: [HACKERS] strange behavior of UPDATE

From: Bruce Momjian <maillist(at)candle(dot)pha(dot)pa(dot)us>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Edmund Mergl <E(dot)Mergl(at)bawue(dot)de>, PostgreSQL Hackers Mailinglist <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: [HACKERS] strange behavior of UPDATE
Date: 1999-07-07 21:13:20
Message-ID: 199907072113.RAA10846@candle.pha.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Added to TODO:

* Improve _bt_binsrch() to handle equal keys better, remove _bt_firsteq()(Tom)

> I wrote:
> > then a possible explanation for the problem would be that our
> > btree index code isn't very fast when there are large numbers of
> > identical keys.
>
> Ah-hah, a lucky guess! I went back and looked at the profile stats
> I'd extracted from Edmund's "update" example. This Linux box has
> the same semi-functional gprof as someone else was using a while
> ago --- the timings are bogus, but the call counts seem right.
> And what I find are entries like this:
>
> 0.00 0.00 284977/284977 _bt_binsrch [3174]
> [3177] 0.0 0.00 0.00 284977 _bt_firsteq [3177]
> 0.00 0.00 21784948/24713758 _bt_compare [3169]
>
> 0.00 0.00 426/35632 _bt_split [53]
> 0.00 0.00 35206/35632 _bt_insertonpg [45]
> [3185] 0.0 0.00 0.00 35632 _bt_findsplitloc [3185]
> 0.00 0.00 5093972/8907411 _bt_itemcmp [3171]
>
> In other words, _bt_firsteq is averaging almost 100 comparisons per
> call, _bt_findsplitloc well over that. Both of these routines are
> evidently designed on the assumption that there will be relatively
> few duplicate keys --- they reduce to linear scans when there are
> many duplicates.
>
> _bt_firsteq shouldn't exist at all; the only routine that calls it
> is _bt_binsrch, which does a fast binary search of the index page.
> _bt_binsrch should be fixed so that the binary search logic does the
> right thing for equal-key cases, rather than degrading to a linear
> search. I am less confident that I understand _bt_findsplitloc's place
> in the great scheme of things, but it could certainly be revised to use
> a binary search rather than linear scan.
>
> This benchmark is clearly overemphasizing the equal-key case, but
> I think it ought to be fixed independently of whether we want to
> look good on a dubious benchmark ... equal keys are not uncommon in
> real-world scenarios after all.
>
> Next question is do we want to risk twiddling this code so soon before
> 6.5, or wait till after?
>
> regards, tom lane
>
>

--
Bruce Momjian | http://www.op.net/~candle
maillist(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 1999-07-07 21:32:45 Re: [HACKERS] Help: fmgr_info: function 0: cache lookup failed
Previous Message Bruce Momjian 1999-07-07 20:56:59 Re: [HACKERS] DEFAULT fixed