Re: Tuplesort merge pre-reading

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Claudio Freire <klaussfreire(at)gmail(dot)com>
Subject: Re: Tuplesort merge pre-reading
Date: 2016-09-09 11:13:49
Message-ID: 2a1b4071-5f4f-4dce-e74b-1c5575c11608@iki.fi
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On 09/08/2016 09:59 PM, Heikki Linnakangas wrote:
> On 09/06/2016 10:26 PM, Peter Geoghegan wrote:
>> On Tue, Sep 6, 2016 at 12:08 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>>> Offhand, I would think that taken together this is very important. I'd
>>> certainly want to see cases in the hundreds of megabytes or gigabytes
>>> of work_mem alongside your 4MB case, even just to be able to talk
>>> informally about this. As you know, the default work_mem value is very
>>> conservative.
>
> I spent some more time polishing this up, and also added some code to
> logtape.c, to use larger read buffers, to compensate for the fact that
> we don't do pre-reading from tuplesort.c anymore. That should trigger
> the OS read-ahead, and make the I/O more sequential, like was the
> purpose of the old pre-reading code. But simpler. I haven't tested that
> part much yet, but I plan to run some tests on larger data sets that
> don't fit in RAM, to make the I/O effects visible.

Ok, I ran a few tests with 20 GB tables. I thought this would show any
differences in I/O behaviour, but in fact it was still completely CPU
bound, like the tests on smaller tables I posted yesterday. I guess I
need to point temp_tablespaces to a USB drive or something. But here we go.

It looks like there was a regression when sorting random text, with 256
MB work_mem. I suspect that was a fluke - I only ran these tests once
because they took so long. But I don't know for sure.

Claudio, if you could also repeat the tests you ran on Peter's patch set
on the other thread, with these patches, that'd be nice. These patches
are effectively a replacement for
0002-Use-tuplesort-batch-memory-for-randomAccess-sorts.patch. And review
would be much appreciated too, of course.

Attached are new versions. Compared to last set, they contain a few
comment fixes, and a change to the 2nd patch to not allocate tape
buffers for tapes that were completely unused.

- Heikki

Attachment Content-Type Size
results-large-master.txt text/plain 548 bytes
results-large-patched.txt text/plain 548 bytes
0001-Don-t-bother-to-pre-read-tuples-into-SortTuple-slots.patch text/x-diff 47.1 KB
0002-Use-larger-read-buffers-in-logtape.patch text/x-diff 11.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitry Dolgov 2016-09-09 11:29:23 [PATCH] Generic type subscription
Previous Message Amit Kapila 2016-09-09 10:54:22 Re: Partition-wise join for join between (declaratively) partitioned tables