Re: polyphase merge?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Greg Stark <stark(at)enterprisedb(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Don Marvick <donmarvick(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: polyphase merge?
Date: 2009-02-04 15:17:55
Message-ID: 2245.1233760675@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greg Stark <stark(at)enterprisedb(dot)com> writes:
> Is this basically the same as our current algorithm but without
> multiplexing the tapes onto single files? I have been wondering
> whether we multiplex the tapes any better than filesystems can lay out
> separate files actually.

The reason for the multiplexing is so that space can get re-used
quickly. If each tape were represented as a separate file, there would
be no way to release blocks as they're read; you could only give back
the whole file after reaching end of tape. Which would at least double
the amount of disk space needed to sort X amount of data. (It's
actually even worse, more like 4X, though the multiplier might depend on
the number of "tapes" --- I don't recall the details anymore.)

The penalty we pay is that in the later merge passes, the blocks
representing a single tape aren't very well ordered.

It might be interesting to think about some compromise that wastes a
little more space in order to get better sequentiality of disk access.
It'd be easy to do if we were willing to accept a 2X space penalty,
but I'm not sure if that would fly or not. It definitely *wasn't*
acceptable to the community a few years ago when the current code was
written. Disks have gotten bigger since then, but so have the problems
people want to solve.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jonah H. Harris 2009-02-04 15:19:02 Re: add_path optimization
Previous Message Greg Stark 2009-02-04 15:12:37 Re: add_path optimization