Re: Trouble with hashagg spill I/O pattern and costing

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Trouble with hashagg spill I/O pattern and costing
Date: 2020-05-21 14:45:10
Message-ID: 20200521144510.2kz5rw2hjqdnf3iz@development
Lists: pgsql-hackers

On Thu, May 21, 2020 at 02:12:55AM +0200, Tomas Vondra wrote:
>
> ...
>
>I agree that's pretty nice. I wonder how far would we need to go before
>reaching a plateau. I'll try this on the other machine with temporary
>tablespace on SATA, but that'll take longer.
>

OK, I've managed to get some numbers from the other machine, with a 75GB
data set and the temp tablespace on SATA RAID. I haven't collected I/O
data using iosnoop this time, because we already know how that changes
from the other machine. I've also only done this with 128MB work_mem,
because of how long a single run takes, and with 128-block
pre-allocation.

The patched+tlist column means both the pre-allocation and the tlist
tweak I posted to this thread a couple of minutes ago:

         master    patched    patched+tlist
-----------------------------------------------------
sort        485        472              462
hash      24686       3060              559

So for the hash case the pre-allocation makes it about 8x faster, and
the tlist tweak makes it roughly another 5.5x faster (about 44x
overall). Not bad, I guess.
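
For anyone who wants to check the ratios, here is a minimal sketch that
recomputes them from the table above. The numbers are copied verbatim
from the table; the dictionary layout and the speedup() helper are just
illustrative names, not anything from the patch.

```python
# Timings copied from the table above (same units as reported there).
timings = {
    "sort": {"master": 485, "patched": 472, "patched+tlist": 462},
    "hash": {"master": 24686, "patched": 3060, "patched+tlist": 559},
}


def speedup(before, after):
    """Return how many times faster `after` is compared to `before`."""
    return before / after


hash_t = timings["hash"]
sort_t = timings["sort"]

# Effect of each change on the hash aggregate case.
prealloc = speedup(hash_t["master"], hash_t["patched"])
tlist = speedup(hash_t["patched"], hash_t["patched+tlist"])
overall = speedup(hash_t["master"], hash_t["patched+tlist"])

# How far hash still is from sort once both changes are applied.
hash_vs_sort = hash_t["patched+tlist"] / sort_t["patched+tlist"]

print(f"pre-allocation alone:  {prealloc:.1f}x")
print(f"tlist tweak on top:    {tlist:.1f}x")
print(f"both combined:         {overall:.1f}x")
print(f"hash/sort after both:  {hash_vs_sort:.2f}")
```

With both changes applied, hash ends up about 1.2x the sort timing,
compared to roughly 50x on master.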

Note: I've slightly tweaked read-ahead on the RAID device(s) on those
patched runs, but the effect was pretty negligible (compared to other
patched runs with the old read-ahead setting).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
