Re: About tapes

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "mac_man2005(at)hotmail(dot)it" <mac_man2005(at)hotmail(dot)it>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: About tapes
Date: 2010-06-18 21:29:23
Message-ID: AANLkTimEuAyvn34NykLpbMISZGLIZc8sEffsZy5SeMKP@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 18, 2010 at 3:46 PM, mac_man2005(at)hotmail(dot)it
<mac_man2005(at)hotmail(dot)it> wrote:
> What is the difference between having more than one tape in a file and
> having one tape per file?

It makes it easier to recycle space a little at a time. Suppose
you're merging two runs of 100 blocks each. You read in a block from
each run and write out two output blocks. Now that you've done that,
the first block of each of the input runs is garbage and can be
recycled - but if the input runs and the output run are in three
separate files, there's no easy way to do that. You can truncate a
file (and throw away the end) but there's no easy way to throw away
the BEGINNING of a file. So you'll probably have to hold on to the
entirety of both inputs until you've written the entirety of the
output.

On the other hand, suppose you have all the blocks in one big file.
The first input run is in blocks 1-100; the second is in blocks
101-200. You can read blocks 1 and 101, say, and write the results to
blocks 201 and 202, using extra storage, but only a little bit. When
you then read blocks 2 and 102, you write the results to blocks 1 and
101, which are no longer needed, because you've already merged them.
When you get done with that, blocks 2 and 102 are now no longer needed
and can be used to write the next part of the output. Of course, you
have to keep track of which order to reread the blocks in when the
sort is done: 201, 202, 1, 101, ... but that's a manageable problem.
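
To make the bookkeeping concrete, here is a minimal Python sketch of the
single-file scheme described above. It is only a simulation of block
numbers, not the actual tape code: the function name merge_in_one_file and
its free list are hypothetical, and it assumes the merge reads one block
from each run in lockstep and writes one output block per input block
read. A real merge consumes its inputs at data-dependent rates.

```python
from collections import deque


def merge_in_one_file(run_a_len, run_b_len):
    """Simulate block recycling while merging two runs kept in one file.

    Blocks are numbered from 1. Run A occupies blocks 1..run_a_len and
    run B occupies the run_b_len blocks right after it. Returns the order
    in which output blocks were written and the highest block number used.
    (Hypothetical helper for illustration only.)
    """
    next_a, end_a = 1, run_a_len
    next_b, end_b = run_a_len + 1, run_a_len + run_b_len
    high_water = end_b      # highest block number allocated so far
    free = deque()          # recycled block numbers, reused oldest first
    output_order = []       # block numbers in the order output was written

    while next_a <= end_a or next_b <= end_b:
        # Read the next unread block of each run that still has one.
        just_read = []
        if next_a <= end_a:
            just_read.append(next_a)
            next_a += 1
        if next_b <= end_b:
            just_read.append(next_b)
            next_b += 1

        # Assumption: one output block is produced per input block read.
        for _ in just_read:
            if free:
                block = free.popleft()   # reuse a block that is garbage
            else:
                high_water += 1          # extend the file by one block
                block = high_water
            output_order.append(block)

        # The blocks just read have been merged, so they can be recycled.
        free.extend(just_read)

    return output_order, high_water


order, high = merge_in_one_file(100, 100)
print(order[:6], high)
```

With two 100-block runs, the output order starts 201, 202, 1, 101, 2, 102
and the file never grows past block 202. Under the same assumptions,
three separate files would hold 100 + 100 + 200 = 400 blocks by the time
the output is complete.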

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
