Re: Increasing default value for effective_io_concurrency?

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Increasing default value for effective_io_concurrency?
Date: 2019-07-01 23:32:15
Message-ID: 20190701233215.wdimoypumnshwbl5@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2019-06-29 22:15:19 +0200, Tomas Vondra wrote:
> I think we should consider changing the effective_io_concurrency default
> value, i.e. the guc that determines how many pages we try to prefetch in
> a couple of places (the most important being Bitmap Heap Scan).

Maybe we need to improve the way it's used / implemented instead - it seems
just too hard to determine the correct setting as currently implemented.

> In some cases it helps a bit, but a bit higher value (4 or 8) performs
> significantly better. Consider for example this "sequential" data set
> from the 6xSSD RAID system (x-axis shows e_i_c values, pct means what
> fraction of pages matches the query):

I assume that the y axis is the time of the query?

How much data is this compared to memory available for the kernel to do
caching?

>  pct       0       1       4      16      64     128
> ----------------------------------------------------
>    1   25990   18624    3269    2219    2189    2171
>    5   88116   60242   14002    8663    8560    8726
>   10  120556   99364   29856   17117   16590   17383
>   25  101080  184327   79212   47884   46846   46855
>   50  130709  309857  163614  103001   94267   94809
>   75  126516  435653  248281  156586  139500  140087
>
> compared to the e_i_c=0 case, it looks like this:
>
>  pct     1     4    16    64   128
> ----------------------------------
>    1   72%   13%    9%    8%    8%
>    5   68%   16%   10%   10%   10%
>   10   82%   25%   14%   14%   14%
>   25  182%   78%   47%   46%   46%
>   50  237%  125%   79%   72%   73%
>   75  344%  196%  124%  110%  111%
>
> So for 1% of the table, e_i_c=1 is about 30% faster, but with
> e_i_c=4 (or more) it's ~10x faster. This is a fairly common pattern, not
> just on this storage system.
>
> With e_i_c=1, performance can be pretty poor, especially when the query
> matches a large fraction of the table: in this data set it's 2-3x
> slower compared to no prefetching, and higher e_i_c values limit the
> damage quite a bit.

I'm surprised the slowdown for small e_i_c values is that big - it's not
obvious to me why that is. Which os / os version / filesystem / io
scheduler / io scheduler settings were used?
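
For reference, the relative numbers in the second quoted table appear to be
each timing divided by the e_i_c=0 timing in the same row. A minimal Python
sketch of that arithmetic follows. The timings are copied from the quoted
table; the names (E_I_C, TIMINGS, relative_to_baseline, RELATIVE) are made up
for illustration, and the time unit is not stated in the thread.

```python
# e_i_c settings benchmarked, in the column order of the quoted table.
E_I_C = [0, 1, 4, 16, 64, 128]

# Raw timings keyed by pct (fraction of pages matching the query).
# Copied from the quoted table; the unit is an open question in this thread.
TIMINGS = {
    1:  [25990, 18624, 3269, 2219, 2189, 2171],
    5:  [88116, 60242, 14002, 8663, 8560, 8726],
    10: [120556, 99364, 29856, 17117, 16590, 17383],
    25: [101080, 184327, 79212, 47884, 46846, 46855],
    50: [130709, 309857, 163614, 103001, 94267, 94809],
    75: [126516, 435653, 248281, 156586, 139500, 140087],
}


def relative_to_baseline(row):
    """Express each timing as a rounded percentage of the first (e_i_c=0) one."""
    baseline = row[0]
    return [round(100 * t / baseline) for t in row[1:]]


# Percentages per pct, one entry for each non-zero e_i_c value.
RELATIVE = {pct: relative_to_baseline(row) for pct, row in TIMINGS.items()}

if __name__ == "__main__":
    print("pct  " + "  ".join(f"{v:>4}" for v in E_I_C[1:]))
    for pct, values in RELATIVE.items():
        print(f"{pct:>3}  " + "  ".join(f"{v:>3}%" for v in values))
```

Values above 100% mean the run was slower than with prefetching disabled,
which is the case for e_i_c=1 from pct=25 upwards.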

Greetings,

Andres Freund
