Re: Tuplesort merge pre-reading

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Tuplesort merge pre-reading
Date: 2016-09-06 19:08:33
Message-ID: CAM3SWZQ4yzxiytLyATuGjxOeowUOjGNmYh1-q5HFY2HEJzZ_VQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 6, 2016 at 5:20 AM, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> I wrote a quick patch to test that, attached. It seems to improve
> performance, at least in this small test case:
>
> create table lotsofints(i integer);
> insert into lotsofints select random() * 1000000000.0 from
> generate_series(1, 10000000);
> vacuum freeze;
>
> select count(*) FROM (select * from lotsofints order by i) t;
>
> On my laptop, with default work_mem=4MB, that select takes 7.8 s on
> unpatched master, and 6.2 s with the attached patch.

The benefits have a lot to do with OS read-ahead, and efficient use of
memory bandwidth during the merge, where we want to access the caller
tuples sequentially per tape (i.e. that's what the batch memory stuff
added -- it also made much better use of available memory). Note that
I've been benchmarking the parallel CREATE INDEX patch on a server
with many HDDs, since sequential performance is mostly all that
matters. I think that in 1999, the preloading had a lot more to do
with logtape.c's ability to aggressively recycle blocks during merges,
such that the total storage overhead does not exceed the original size
of the caller tuples (plus what it calls "trivial bookkeeping
overhead" IIRC). That's less important these days, but still matters
some (it's more of an issue when you can't complete the sort in one
pass, which is rare these days).

Offhand, I would think that taken together this is very important. I'd
certainly want to see cases in the hundreds of megabytes or gigabytes
of work_mem alongside your 4MB case, even just to be able to talk
informally about this. As you know, the default work_mem value is very
conservative.

--
Peter Geoghegan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2016-09-06 19:12:59 Re: [GENERAL] C++ port of Postgres
Previous Message Petr Jelinek 2016-09-06 19:04:56 Re: Logical Replication WIP