Re: Small improvement to compactify_tuples

From: Claudio Freire <klaussfreire(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Юрий Соколов <funny(dot)falcon(at)gmail(dot)com>, Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, PostgreSQL-Dev <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Small improvement to compactify_tuples
Date: 2017-11-08 16:29:15
Message-ID: CAGTBQpZgs4hVmDtHg4F6MDJuAE9-U=z6d+yJzjPsZup7AZhV0Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 8, 2017 at 12:33 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Tue, Nov 7, 2017 at 4:39 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> What I'm getting from the standard pgbench measurements, on both machines,
>>> is that this patch might be a couple percent slower than HEAD, but that is
>>> barely above the noise floor so I'm not too sure about it.
>
>> Hmm. It seems like slowing down single client performance by a couple
>> of percent is something that we really don't want to do.
>
> I do not think there is any change here that can be proven to always be a
> win. Certainly the original patch, which proposes to replace an O(n log n)
> sort algorithm with an O(n^2) one, should not be thought to be that.
> The question to focus on is what's the average case, and I'm not sure how
> to decide what the average case is. But more than two test scenarios
> would be a good start.
>
> regards, tom lane

Doing no change to the overall algorithm and replacing qsort with an
inlineable type-specific one should be a net win in all cases.

Doing bucket sort with a qsort of large buckets (or small tuple
arrays) should also be a net win in all cases.

Using shell sort might not seem clear, but lets not forget the
original patch only uses it in very small arrays and very infrequently
at that.

What's perhaps not clear is whether there are better ideas. Like
rebuilding the page as Tom proposes, which doesn't seem like a bad
idea. Bucket sort already is O(bytes), just as memcopy, only it has a
lower constant factor (it's bytes/256 in the original patch), which
might make copying the whole page an extra time lose against bucket
sort in a few cases.

Deciding that last point does need more benchmarking. That doesn't
mean the other improvements can't be pursued in the meanwhile, right?

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2017-11-08 16:46:45 Re: taking stdbool.h into use
Previous Message Peter Geoghegan 2017-11-08 16:26:40 Re: Small improvement to compactify_tuples