Re: Mark/Restore and avoiding RandomAccess sorts

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Mark/Restore and avoiding RandomAccess sorts
Date: 2007-01-08 18:43:24
Message-ID: 20070108184324.GR12217@nasby.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 08, 2007 at 10:37:25AM +0000, Heikki Linnakangas wrote:
> >Simon Riggs wrote:
> >>Implementing the variable mark/restore buffer as a dumb Tuplestore would
> >>mean that the space usage of the Sort could in worst case go as high as
> >>x2 total space. The worst case is where the inner scan is all a single
> >>value. The best case is where the inner scan is sufficiently unique over
> >>all its values that it never writes back to disk at all.
> >>
> >>So a further refinement of this idea would be to simply defer the final
> >>merge operation for the sort until the history required for the Mark
> >>operation exceeded, say, 10% of the sort size. That would then be
> >>sufficient to improve performance for most common cases, without risking
> >>massive space overflow for large and highly non-unique data. There's no
> >>problem with running the final merge slightly later than before;
> >>everything's still there to allow it. Reusing space in the tuplestore is
> >>also straightforward since that's exactly what the final merge already
> >>does, so some rework of that code should be sufficient.
>
> Should definitely be done by reusing the space in the tuplestore, we
> don't want to use double the space we do now in the degenerate case.

Another idea comes to mind, which would apply to all sorts needing
random access.

Rather than completely copying every tuple to build a resultset you can
step through randomnly, why not just build a list of tuple locations as
the sort returns results? That would allow seeking to any position in
the resultset with minimal overhead.
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Stefan Kaltenbrunner 2007-01-08 19:38:37 weird buildfarm failures on arm/mipsel and --with-tcl
Previous Message D'Arcy J.M. Cain 2007-01-08 17:49:08 Re: pgsql: Widen the money type to 64 bits.