Re: Tuplesort merge pre-reading

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Peter Geoghegan <pg(at)heroku(dot)com>
Cc:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Claudio Freire <klaussfreire(at)gmail(dot)com>
Subject:	Re: Tuplesort merge pre-reading
Date:	2016-09-09 11:13:49
Message-ID:	2a1b4071-5f4f-4dce-e74b-1c5575c11608@iki.fi
Lists:	pgsql-hackers

On 09/08/2016 09:59 PM, Heikki Linnakangas wrote:
> On 09/06/2016 10:26 PM, Peter Geoghegan wrote:
>> On Tue, Sep 6, 2016 at 12:08 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>>> Offhand, I would think that taken together this is very important. I'd
>>> certainly want to see cases in the hundreds of megabytes or gigabytes
>>> of work_mem alongside your 4MB case, even just to be able to talk
>>> informally about this. As you know, the default work_mem value is very
>>> conservative.
>
> I spent some more time polishing this up, and also added some code to
> logtape.c, to use larger read buffers, to compensate for the fact that
> we don't do pre-reading from tuplesort.c anymore. That should trigger
> the OS read-ahead, and make the I/O more sequential, like was the
> purpose of the old pre-reading code. But simpler. I haven't tested that
> part much yet, but I plan to run some tests on larger data sets that
> don't fit in RAM, to make the I/O effects visible.
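To illustrate the idea of larger read buffers described above, here is a minimal sketch, not the code from the attached patch. The names TapeReader, tape_reader_init, tape_reader_read and tape_reader_free are hypothetical and do not exist in logtape.c, and stdio is used only to keep the sketch portable. Each refill asks for buf_size bytes in a single call, so a larger buffer means fewer, larger sequential reads.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-tape reader; NOT the logtape.c data structure. */
typedef struct TapeReader
{
    FILE   *fp;         /* underlying temporary file */
    char   *buf;        /* read buffer of buf_size bytes */
    size_t  buf_size;   /* buffer capacity */
    size_t  nbytes;     /* valid bytes currently in buf */
    size_t  pos;        /* offset of next unread byte in buf */
    long    nrefills;   /* number of fread() calls issued so far */
} TapeReader;

/* Allocate the read buffer. Returns 0 on success, -1 if out of memory. */
static int
tape_reader_init(TapeReader *tr, FILE *fp, size_t buf_size)
{
    assert(buf_size > 0);
    tr->buf = malloc(buf_size);
    if (tr->buf == NULL)
        return -1;
    tr->fp = fp;
    tr->buf_size = buf_size;
    tr->nbytes = 0;
    tr->pos = 0;
    tr->nrefills = 0;
    return 0;
}

/*
 * Copy up to len bytes into dst, refilling the buffer with one large
 * fread() whenever it runs empty. Returns the number of bytes copied;
 * a return value smaller than len means end of file (or a read error).
 */
static size_t
tape_reader_read(TapeReader *tr, void *dst, size_t len)
{
    size_t copied = 0;

    while (copied < len)
    {
        size_t n;

        if (tr->pos == tr->nbytes)
        {
            /* Buffer exhausted: fetch the next buf_size bytes at once. */
            tr->nbytes = fread(tr->buf, 1, tr->buf_size, tr->fp);
            tr->pos = 0;
            tr->nrefills++;
            if (tr->nbytes == 0)
                break;
        }
        n = tr->nbytes - tr->pos;
        if (n > len - copied)
            n = len - copied;
        memcpy((char *) dst + copied, tr->buf + tr->pos, n);
        tr->pos += n;
        copied += n;
    }
    return copied;
}

/* Release the read buffer; the caller still owns and closes fp. */
static void
tape_reader_free(TapeReader *tr)
{
    free(tr->buf);
    tr->buf = NULL;
}
```

In this sketch the caller picks buf_size per tape. How the buffer size is actually chosen in the patch is not shown here.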

Ok, I ran a few tests with 20 GB tables. I thought this would show any
differences in I/O behaviour, but in fact it was still completely CPU
bound, like the tests on smaller tables I posted yesterday. I guess I
need to point temp_tablespaces to a USB drive or something. But here we go.

It looks like there was a regression when sorting random text, with 256
MB work_mem. I suspect that was a fluke - I only ran these tests once
because they took so long. But I don't know for sure.

Claudio, if you could also repeat the tests you ran on Peter's patch set
on the other thread, with these patches, that'd be nice. These patches
are effectively a replacement for
0002-Use-tuplesort-batch-memory-for-randomAccess-sorts.patch. And review
would be much appreciated too, of course.

Attached are new versions. Compared to the last set, they contain a few
comment fixes, and a change to the 2nd patch so that it does not allocate
tape buffers for tapes that were completely unused.

- Heikki

Attachment	Content-Type	Size
results-large-master.txt	text/plain	548 bytes
results-large-patched.txt	text/plain	548 bytes
0001-Don-t-bother-to-pre-read-tuples-into-SortTuple-slots.patch	text/x-diff	47.1 KB
0002-Use-larger-read-buffers-in-logtape.patch	text/x-diff	11.1 KB

In response to

Re: Tuplesort merge pre-reading at 2016-09-08 18:59:41 from Heikki Linnakangas

Responses

Re: Tuplesort merge pre-reading at 2016-09-09 11:55:40 from Heikki Linnakangas
Re: Tuplesort merge pre-reading at 2016-09-09 12:01:06 from Heikki Linnakangas
Re: Tuplesort merge pre-reading at 2016-09-10 00:51:08 from Claudio Freire
