Re: Why percent_rank is so slower than rank?

From: Kenneth Marshall <ktm(at)rice(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Why percent_rank is so slower than rank?
Date: 2010-12-09 23:15:25
Message-ID: 20101209231525.GQ4028@aart.is.rice.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Dec 09, 2010 at 05:18:57PM -0500, Tom Lane wrote:
> I wrote:
> > We're throwing away one tuple at a time as we advance forward through
> > the tuplestore, and moving 100000+ tuple pointers each time. Ugh.
> > This code was all right when written, because (IIRC) the mergejoin
> > case was actually the only caller. But it's not all right for
> > WindowAgg's less-predictable usage patterns.
>
> > I thought for a bit about changing things around so that the first-used
> > tuple slot isn't necessarily state->memtuples[0], but just like the
> > comment says, that complicates a lot of other logic. And there isn't
> > any easy place to reclaim the wasted slots later.
>
> > What seems like the best bet is to put in a heuristic to make
> > tuplestore_trim simply not do anything until nremove reaches some
> > reasonably large amount, perhaps 10% of the number of stored tuples.
> > This wastes up to 10% of the alloted memory, but that seems tolerable.
>
> On reflection I think just not doing anything isn't a very good idea.
> The problem with that is that a mis-coded caller could try to fetch
> tuples that it had already told the tuplestore could be trimmed away;
> and this would work, most of the time, until you got unlucky and the
> trim operation had actually deleted them. I think it's pretty important
> for bug-catching purposes that the tuplestore enforce that those tuples
> are not available anymore.
>
> Hence the attached patch, which combines the two ideas by recycling
> tuples immediately but not sliding the pointer array until a reasonable
> amount of movement has occurred. This fixes the complained-of
> performance problem AFAICT.
>
> I'm not sure whether or not to back-patch this into 9.0 and 8.4. The
> code in tuplestore.c hasn't changed at all since 8.4, so there's not
> much risk of cross-version bugs, but if I did miss anything we could
> be shipping a buggy version next week. Thoughts?
>
> regards, tom lane
>

+1 for back patching.

Ken

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Roberto Icardi 2010-12-10 01:33:39 Re: Our paths have finally crossed
Previous Message Kevin Grittner 2010-12-09 23:08:46 Re: Why percent_rank is so slower than rank?