Re: Logical tape pause/resume

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)heroku(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>
Subject: Re: Logical tape pause/resume
Date: 2016-10-04 11:09:05
Message-ID: CANP8+jKVDt+GPmepFd8saKUNQY26_6xvQZT4VU0cRBrb7A-daA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 4 October 2016 at 11:47, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:

> When we finish writing an initial run to a tape, we keep the tape buffers
> around. That's the reason we need to reserve that memory. But there's no
> fundamental reason we have to do that. As I suggested in [2], we could flush
> the buffers to disk, and only keep in memory the location of the last block
> we wrote. If we need to write another run to the tape, we can reload the
> last incomplete block from the disk, and continue writing.
>
> Reloading the last block, requires an extra I/O. That's OK. It's quite rare
> to have a multi-pass sort these days, so it's a good bet that you don't need
> to do it. And if you have a very small work_mem, so that you need to do a
> multi-pass sort, having some more memory available for building the initial
> runs is probably worth it, and there's also a good chance that the block is
> still in the OS cache.

Sounds like a good idea.

Why not just make each new run start at a block boundary?
That way we waste on average BLCKSZ/2 disk space per run, which is
negligible but we avoid any need to have code to read back in the last
block.

That also makes it easier to put each run in its own file, which will
surely help with parallelism and compression since we can spread
multiple run files across multiple temp tablespaces.

Can we also please read in the value from the last tuple in a run, so
we have min and max values in memory for each run? That can be used
during the merge step to avoid merging runs unless the value ranges
overlap.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2016-10-04 11:25:10 Incorrect comment/doc for poll_query_until
Previous Message Heikki Linnakangas 2016-10-04 10:47:30 Logical tape pause/resume