Re: Increasing default value for effective_io_concurrency?

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Increasing default value for effective_io_concurrency?
Date: 2019-07-02 08:03:22
Message-ID: 20190702080322.6yoo6rbmvg4xvo3d@development
Lists: pgsql-hackers

On Mon, Jul 01, 2019 at 04:32:15PM -0700, Andres Freund wrote:
>Hi,
>
>On 2019-06-29 22:15:19 +0200, Tomas Vondra wrote:
>> I think we should consider changing the effective_io_concurrency default
>> value, i.e. the guc that determines how many pages we try to prefetch in
>> a couple of places (the most important being Bitmap Heap Scan).
>
>Maybe we need improve the way it's used / implemented instead - it seems
>just too hard to determine the correct setting as currently implemented.
>

Sure, if we can improve those bits, that'd be nice. It's definitely hard
to decide what value is appropriate for a given storage system. But I'm
not sure it's something we can do easily, considering how opaque the
hardware is for us ...

I wonder

>
>> In some cases it helps a bit, but a bit higher value (4 or 8) performs
>> significantly better. Consider for example this "sequential" data set
>> from the 6xSSD RAID system (x-axis shows e_i_c values, pct means what
>> fraction of pages matches the query):
>
>I assume that the y axis is the time of the query?
>

The y-axis is the fraction of the table matched by the query. The values
in the contingency table are query durations (average of 3 runs, but the
numbers were very close).

>How much data is this compared to memory available for the kernel to do
>caching?
>

A multiple of RAM, in all cases. The queries were hitting random subsets
of the data, and the page cache was dropped after each test, to eliminate
cross-query caching.
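
To make the procedure a bit more concrete, here is a rough sketch of the
test loop. It only lists the steps and executes nothing. The
drop_caches command is an assumption (one common way to drop the Linux
page cache, needs root), and the query passed in is a hypothetical
placeholder, not the actual benchmark query.

```python
# Sketch of the test loop described above. Nothing is run against a real
# database or OS; the function only returns the steps in order.
E_I_C_VALUES = [0, 1, 4, 16, 64, 128]  # e_i_c values from the results table
RUNS = 3                               # durations are an average of 3 runs

# Assumption: a common way to drop the Linux page cache (requires root).
DROP_CACHES = "sync; echo 3 > /proc/sys/vm/drop_caches"


def benchmark_steps(query, values=E_I_C_VALUES, runs=RUNS):
    """Return a list of (kind, text) steps; kind is 'sql' or 'shell'."""
    steps = []
    for eic in values:
        for _ in range(runs):
            # Set the GUC for the session, then run the timed query.
            steps.append(("sql", f"SET effective_io_concurrency = {eic};"))
            steps.append(("sql", query))
            # Drop the page cache after each test so that the next query
            # does not benefit from pages read by this one.
            steps.append(("shell", DROP_CACHES))
    return steps


if __name__ == "__main__":
    # Hypothetical query and table name, for illustration only.
    for kind, text in benchmark_steps("SELECT * FROM t WHERE a < 1000;"):
        print(f"{kind:5s} {text}")
```

Note that this only covers the kernel page cache; it says nothing about
what is in shared buffers.
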

>
>>  pct        0        1        4       16       64      128
>>  ---------------------------------------------------------
>>    1    25990    18624     3269     2219     2189     2171
>>    5    88116    60242    14002     8663     8560     8726
>>   10   120556    99364    29856    17117    16590    17383
>>   25   101080   184327    79212    47884    46846    46855
>>   50   130709   309857   163614   103001    94267    94809
>>   75   126516   435653   248281   156586   139500   140087
>>
>> compared to the e_i_c=0 case, it looks like this:
>>
>>  pct       1       4      16      64     128
>>  -------------------------------------------
>>    1     72%     13%      9%      8%      8%
>>    5     68%     16%     10%     10%     10%
>>   10     82%     25%     14%     14%     14%
>>   25    182%     78%     47%     46%     46%
>>   50    237%    125%     79%     72%     73%
>>   75    344%    196%    124%    110%    111%
>>
>> So for 1% of the table the e_i_c=1 is faster by about ~30%, but with
>> e_i_c=4 (or more) it's ~10x faster. This is a fairly common pattern, not
>> just on this storage system.
>>
>> The e_i_c=1 can perform pretty poorly, especially when the query matches
>> large fraction of the table - for example in this example it's 2-3x
>> slower compared to no prefetching, and higher e_i_c values limit the
>> damage quite a bit.
>
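
For reference, the relative numbers in the second table are just each
duration divided by the e_i_c=0 duration in the same row. A small sketch
that recomputes them from the first table (the variable and function
names are mine, picked for illustration):

```python
# Durations from the first table: pct of table matched -> one value per
# e_i_c setting, in the order 0, 1, 4, 16, 64, 128.
DURATIONS = {
    1: [25990, 18624, 3269, 2219, 2189, 2171],
    5: [88116, 60242, 14002, 8663, 8560, 8726],
    10: [120556, 99364, 29856, 17117, 16590, 17383],
    25: [101080, 184327, 79212, 47884, 46846, 46855],
    50: [130709, 309857, 163614, 103001, 94267, 94809],
    75: [126516, 435653, 248281, 156586, 139500, 140087],
}


def relative_to_baseline(row):
    """Express each duration as a whole percentage of the e_i_c=0 one."""
    baseline = row[0]
    return [round(100 * d / baseline) for d in row[1:]]


if __name__ == "__main__":
    for pct, row in DURATIONS.items():
        cells = "".join(f"{v:7d}%" for v in relative_to_baseline(row))
        print(f"{pct:4d}{cells}")
```

Values below 100% mean the query ran faster than without prefetching,
values above 100% mean it ran slower.
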
>I'm surprised the slowdown for small e_i_c values is that big - it's not
>obvious to me why that is. Which os / os version / filesystem / io
>scheduler / io scheduler settings were used?
>

This is the system with the NVMe storage and the SATA RAID:

Linux bench2 4.19.26 #1 SMP Sat Mar 2 19:50:14 CET 2019 x86_64 Intel(R)
Xeon(R) CPU E5-2620 v4 @ 2.10GHz GenuineIntel GNU/Linux

/dev/nvme0n1p1 on /mnt/data type ext4 (rw,relatime)
/dev/md0 on /mnt/raid type ext4 (rw,relatime,stripe=48)

The other system looks pretty much the same (same kernel, ext4).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
