Re: Merge algorithms for large numbers of "tapes"

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Merge algorithms for large numbers of "tapes"
Date: 2006-03-08 16:35:04
Message-ID: 1141835704.27729.749.camel@localhost.localdomain
Lists: pgsql-hackers

On Wed, 2006-03-08 at 10:21 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> > 1. Earlier we had some results that showed that the heapsorts got slower
> > when work_mem was higher and that concerns me most of all right now.
>
> Fair enough, but that's completely independent of the merge algorithm.
> (I don't think the Nyberg results necessarily apply to our situation
> anyway, as we are not sorting arrays of integers, and hence the cache
> effects are far weaker for us. I don't mind trying alternate sort
> algorithms, but I'm not going to believe an improvement in advance of
> direct evidence in our own environment.)

Of course, this would be prototyped first...and I agree about possible
variability of those results for us.

> > 2. Improvement in the way we do overall memory allocation, so we would
> > not have the problem of undersetting work_mem that we currently
> > experience. If we solved this problem we would have faster sorts in
> > *all* cases, not just extremely large ones. Dynamically setting work_mem
> > higher when possible would be very useful.
>
> I think this would be extremely dangerous, as it would encourage
> processes to take more than their fair share of available resources.

Fair share is the objective. I was trying to describe the general case
so we could discuss a solution that would allow a dynamic approach
rather than the static one we have now.

We want to handle these cases ("how much to allocate, when..."):
A. we have the predicted number of users
B. we have a busy system - more than the predicted number of users
C. we have a quiet system - fewer than the predicted number of users

In B/C we have to be careful that we don't under- or over-allocate resources
only to find that the situation changes immediately afterwards.
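
To make that concrete, here is a minimal sketch of one way a dynamic
budget could be derived for cases A-C above. This is plain illustrative C,
not existing PostgreSQL code; the names (SortMemPolicy, shared_pool_kb,
predicted_users, etc.) are assumptions, not real GUCs or APIs.

    /*
     * Hypothetical sketch: divide a shared sort-memory pool among the
     * backends currently sorting, never dropping below the static
     * work_mem floor.
     */
    #include <stdio.h>

    typedef struct
    {
        long    shared_pool_kb;     /* total memory reserved for sorts */
        int     predicted_users;    /* expected concurrent sorts */
        long    static_work_mem_kb; /* today's fixed per-sort setting */
    } SortMemPolicy;

    static long
    effective_work_mem_kb(const SortMemPolicy *p, int active_users)
    {
        long    share;

        if (active_users < 1)
            active_users = 1;

        /* Case A: load matches prediction -> roughly the static setting */
        /* Case B: busy system -> share shrinks, but not below the floor */
        /* Case C: quiet system -> share grows, using the spare headroom */
        share = p->shared_pool_kb / active_users;

        if (share < p->static_work_mem_kb)
            share = p->static_work_mem_kb;  /* conservative floor */

        return share;
    }

    int
    main(void)
    {
        /* 1 GB pool, 10 predicted users, 16 MB floor */
        SortMemPolicy p = {1024 * 1024, 10, 16 * 1024};

        printf("predicted load: %ld kB\n", effective_work_mem_kb(&p, 10));
        printf("busy system:    %ld kB\n", effective_work_mem_kb(&p, 40));
        printf("quiet system:   %ld kB\n", effective_work_mem_kb(&p, 2));
        return 0;
    }

Note that this reacts instantly to the active-user count, which is exactly
the B/C instability worry above; a workable version would presumably need
to damp or cap changes rather than chase the load from moment to moment.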

In many cases static allocation is actually essential, since you may be more
interested in guaranteeing a conservative run time than in producing
occasional/unpredictable bursts of speed. But in many cases people want
certain tasks to go faster when it's quiet and slower when it's not.

> Also, to the extent that you believe the problem is insufficient L2
> cache, it seems increasing work_mem to many times the size of L2 will
> always be counterproductive.

Sorry for the confusion: (1) and (2) were completely separate points, so no
interaction between L2 cache and memory was intended.

> (Certainly there is no value in increasing
> work_mem until we are in a regime where it consistently improves
> performance significantly, which it seems we aren't yet.)

Very much agreed.

Best Regards, Simon Riggs
