Re: Tuplesort merge pre-reading

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Tuplesort merge pre-reading
Date: 2016-09-28 16:11:52
Message-ID: CAM3SWZTv6TOqtAE9n-o7+b=xF-TWfvo0nznSjRX1xX3SBgvETw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 28, 2016 at 5:04 PM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>> Not sure that I understand. I agree that each merge pass tends to use
>> roughly the same number of tapes, but the distribution of real runs on
>> tapes is quite unbalanced in earlier merge passes (due to dummy runs).
>> It looks like you're always using batch memory, even for non-final
>> merges. Won't that fail to be in balance much of the time because of
>> the lopsided distribution of runs? Tapes have an uneven amount of real
>> data in earlier merge passes.
>
>
> How does the distribution of the runs on the tapes matter?

The exact details are not really relevant to this discussion (I think
it's confusing that we simply say "Target Fibonacci run counts",
FWIW), but the simple fact that it can be quite uneven is.

This is why I never pursued batch memory for non-final merges. Isn't
that what you're doing here? You're pretty much always setting
"state->batchUsed = true".

>> I'm basically repeating myself here, but: I think it's incorrect that
>> LogicalTapeAssignReadBufferSize() is called so indiscriminately (more
>> generally, it is questionable that it is called in such a high level
>> routine, rather than the start of a specific merge pass -- I said so a
>> couple of times already).
>
>
> You can't release the tape buffer at the end of a pass, because the buffer
> of a tape will already be filled with data from the next run on the same
> tape.

Okay, but can't you just not use batch memory for non-final merges,
per my initial approach? That seems far cleaner.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2016-09-28 16:20:23 Re: Tuplesort merge pre-reading
Previous Message Heikki Linnakangas 2016-09-28 16:04:48 Re: Tuplesort merge pre-reading