Re: Allow a per-tablespace effective_io_concurrency setting

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Allow a per-tablespace effective_io_concurrency setting
Date: 2015-09-02 16:06:54
Message-ID: 55E71E9E.4090403@2ndquadrant.com
Lists: pgsql-hackers pgsql-performance

Hi

On 09/02/2015 03:53 PM, Andres Freund wrote:
>
> Hi,
>
> On 2015-07-18 12:17:39 +0200, Julien Rouhaud wrote:
>> I didn't know that the thread must exists on -hackers to be able to add
>> a commitfest entry, so I transfer the thread here.
>
> Please, in the future, also update the title of the thread to something
> fitting.
>
>> @@ -539,6 +541,9 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
>> {
>> BitmapHeapScanState *scanstate;
>> Relation currentRelation;
>> +#ifdef USE_PREFETCH
>> + int new_io_concurrency;
>> +#endif
>>
>> /* check for unsupported flags */
>> Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
>> @@ -598,6 +603,25 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
>> */
>> currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
>>
>> +#ifdef USE_PREFETCH
>> + /* check if the effective_io_concurrency has been overloaded for the
>> + * tablespace storing the relation and compute the target_prefetch_pages,
>> + * or just get the current target_prefetch_pages
>> + */
>> + new_io_concurrency = get_tablespace_io_concurrency(
>> + currentRelation->rd_rel->reltablespace);
>> +
>> +
>> + scanstate->target_prefetch_pages = target_prefetch_pages;
>> +
>> + if (new_io_concurrency != effective_io_concurrency)
>> + {
>> + double prefetch_pages;
>> + if (compute_io_concurrency(new_io_concurrency, &prefetch_pages))
>> + scanstate->target_prefetch_pages = rint(prefetch_pages);
>> + }
>> +#endif
>
> Maybe it's just me - but imo there should be as few USE_PREFETCH
> dependant places in the code as possible. It'll just be 0 when not
> supported, that's fine?

It's not just you. Dealing with code with plenty of ifdefs is annoying -
it compiles just fine most of the time, until you compile it with some
rare configuration. Then it either starts producing strange warnings or
the compilation fails entirely.

It might make a tiny difference on builds without prefetching support
because of code size, but realistically I think it's noise.
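A self-contained sketch of what Andres is suggesting (hypothetical names; in a real build USE_PREFETCH comes from configure): if the prefetch target is simply 0 on builds without prefetch support, the prefetch loop becomes a natural no-op and call sites need no #ifdef at all.

```c
#include <assert.h>

/*
 * Hypothetical ifdef-free scheme: on builds without prefetching the
 * target stays 0, so the loop below never runs and no call site needs
 * to be wrapped in #ifdef USE_PREFETCH.
 */
#ifndef USE_PREFETCH			/* simulating a build without prefetch */
#define PREFETCH_DEFAULT 0
#else
#define PREFETCH_DEFAULT 32
#endif

static int	target_prefetch_pages = PREFETCH_DEFAULT;

/* No #ifdef needed at the call site. */
static int
prefetch_some_pages(void)
{
	int			issued = 0;

	for (int i = 0; i < target_prefetch_pages; i++)
		issued++;				/* stand-in for an actual prefetch call */

	return issued;
}
```

On a build without prefetch support the function trivially issues nothing; the only cost is a compare against zero.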

> Especially changing the size of externally visible structs depending
> on a configure-detected ifdef seems wrong to me.

+100 to that

>> + /*----------
>> + * The user-visible GUC parameter is the number of drives (spindles),
>> + * which we need to translate to a number-of-pages-to-prefetch target.
>> + * The target value is stashed in *extra and then assigned to the actual
>> + * variable by assign_effective_io_concurrency.
>> + *
>> + * The expected number of prefetch pages needed to keep N drives busy is:
>> + *
>> + * drives | I/O requests
>> + * -------+----------------
>> + * 1 | 1
>> + * 2 | 2/1 + 2/2 = 3
>> + * 3 | 3/1 + 3/2 + 3/3 = 5 1/2
>> + * 4 | 4/1 + 4/2 + 4/3 + 4/4 = 8 1/3
>> + * n | n * H(n)
>
> I know you just moved this code. But: I don't buy this formula. Like at
> all. Doesn't queuing and reordering entirely invalidate the logic here?

Well, even the comment right after the formula says that:

 * Experimental results show that both of these formulas aren't
 * aggressive enough, but we don't really have any better proposals.
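For reference, the drives-to-prefetch-pages translation in the table above is just the harmonic sum n * H(n). A minimal standalone sketch (hypothetical function name, not the actual compute_io_concurrency, which also deals with GUC clamping):

```c
#include <assert.h>
#include <math.h>

/*
 * Expected number of in-flight prefetch requests needed to keep
 * 'drives' spindles busy: drives * H(drives), where H is the
 * harmonic number. Matches the table: 1 -> 1, 2 -> 3, 3 -> 5.5, ...
 */
static double
prefetch_pages_for_drives(int drives)
{
	double		pages = 0.0;

	for (int i = 1; i <= drives; i++)
		pages += (double) drives / (double) i;

	return pages;
}
```

The result is then rounded to an integer page count; the point under discussion is that this model says nothing about per-device queue depth, which is what actually matters for NCQ disks and SSDs.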

That's the reason why users generally either use 0 or some rather high
value (16 or 32 are the most common values I see). The problem is that we
don't really care about the number of spindles (and not just because
SSDs don't have them at all), but about the target queue length per
device. Spinning rust uses TCQ/NCQ to optimize the head movement, SSDs
are parallel by nature (stacking multiple chips with separate channels).

I doubt we can really improve the formula, except maybe for saying "we
want 16 requests per device" and multiplying the number by that. We
don't really have the necessary introspection to determine better values
(and it's not really possible anyway, because the devices may be hidden
behind a RAID controller or a SAN). So we can't really do much.

Maybe the best thing we can do is just completely abandon the "number of
spindles" idea, and just say "number of I/O requests to prefetch".
Possibly with an explanation of how to estimate it (devices * queue length).
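The "number of I/O requests to prefetch" estimate would then be trivial to state. A hypothetical sketch (the per-device queue depth is a guess the admin supplies, e.g. 16 for NCQ disks or SSD channels; the 1000 cap is an assumption mirroring effective_io_concurrency's upper bound):

```c
#include <assert.h>

/* Assumed upper bound, mirroring the GUC's maximum. */
#define MAX_IO_CONCURRENCY 1000

/*
 * Hypothetical replacement for the spindle formula: the prefetch
 * target is simply devices * per-device queue depth, clamped.
 */
static int
prefetch_requests(int devices, int queue_depth)
{
	int			target = devices * queue_depth;

	return (target > MAX_IO_CONCURRENCY) ? MAX_IO_CONCURRENCY : target;
}
```

For example, 4 devices behind a controller with a queue depth of 16 would give a target of 64 in-flight requests, with no spindle model involved.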

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
