Skip site navigation (1) Skip section navigation (2)

Re: tuplesort memory usage: grow_memtuples

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: Andres Freund <andres(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Greg S <stark(at)mit(dot)edu>
Subject: Re: tuplesort memory usage: grow_memtuples
Date: 2013-01-17 06:00:15
Message-ID: (view raw, whole thread or download thread mbox)
Lists: pgsql-hackers
Peter Geoghegan <peter(at)2ndquadrant(dot)com> writes:
> I took another look at this.

Since Greg S. seems to have lost interest in committing this, I am
picking it up.

> My strategy for preventing overflow is to use a uint64, and to use
> Min()/Max() as appropriate. As Robert mentioned, even a 64-bit integer
> could overflow here, and I account for that.

It seems to me that there's a much more robust way to do this, namely
use float8 arithmetic.  Anybody have a problem with this version of
the last-increment branch?

         * This will be the last increment of memtupsize.  Abandon doubling
         * strategy and instead increase as much as we safely can.
         * To stay within allowedMem, we can't increase memtupsize by more
         * than availMem / sizeof(SortTuple) elements.  In practice, we want
         * to increase it by considerably less, because we need to leave some
         * space for the tuples to which the new array slots will refer.  We
         * assume the new tuples will be about the same size as the tuples
         * we've already seen, and thus we can extrapolate from the space
         * consumption so far to estimate an appropriate new size for the
         * memtuples array.  The optimal value might be higher or lower than
         * this estimate, but it's hard to know that in advance.
         * This calculation is definitely safe against enlarging the array so
         * much that LACKMEM becomes true, because the memory currently used
         * includes the present array; thus, there would be enough allowedMem
         * for the new array elements even if no other memory were currently
         * used.
         * We do the arithmetic in float8, because otherwise the product of
         * memtupsize and allowedMem would be quite likely to overflow.  Any
         * inaccuracy in the result should be insignificant, but just for
         * paranoia's sake, we bound the result to be 1 to 2 times the
         * existing value.  (A little algebra shows that grow_ratio must be
         * less than 2 here, so except for roundoff error the Min/Max bounds
         * should never do anything.)
         * Note: it might seem that we need to worry about memtupsize * 2
         * overflowing an int, but the MaxAllocSize bound applied below will
         * ensure that can't happen.
        double        grow_ratio;

        grow_ratio = (double) state->allowedMem / (double) memNowUsed;
        newmemtupsize = (int) (memtupsize * grow_ratio);

        newmemtupsize = Max(newmemtupsize, memtupsize);
        newmemtupsize = Min(newmemtupsize, memtupsize * 2);

        /* We won't make any further enlargement attempts */
        state->growmemtuples = false;

> I also added a brief note within tuplestore.c to the effect that the
> two buffer sizing strategies are not in sync.

My inclination is to just copy the whole grow_memtuples function into
tuplestore.c too.  There's no very good reason why tuplestore should
be stupider than tuplesort about this.

			regards, tom lane

In response to

pgsql-hackers by date

Next:From: Tom LaneDate: 2013-01-17 06:06:13
Subject: Re: CF3+4
Previous:From: Pavan DeolaseeDate: 2013-01-17 05:47:40
Subject: Re: Hot Standby conflict resolution handling

Privacy Policy | About PostgreSQL
Copyright © 1996-2017 The PostgreSQL Global Development Group