Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore
Date: 2015-09-02 14:12:25
Message-ID: 20150902141225.GB5286@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2015-07-19 16:34:52 -0700, Peter Geoghegan wrote:
> +# PGAC_C_BUILTIN_PREFETCH
> +# -------------------------
> +# Check if the C compiler understands __builtin_prefetch(),
> +# and define HAVE__BUILTIN_PREFETCH if so.
> +AC_DEFUN([PGAC_C_BUILTIN_PREFETCH],
> +[AC_CACHE_CHECK(for __builtin_prefetch, pgac_cv__builtin_prefetch,
> +[AC_COMPILE_IFELSE([AC_LANG_PROGRAM([],
> +[int i = 0;__builtin_prefetch(&i, 0, 3);])],
> +[pgac_cv__builtin_prefetch=yes],
> +[pgac_cv__builtin_prefetch=no])])
> +if test x"$pgac_cv__builtin_prefetch" = xyes ; then
> +AC_DEFINE(HAVE__BUILTIN_PREFETCH, 1,
> + [Define to 1 if your compiler understands __builtin_prefetch.])
> +fi])# PGAC_C_BUILTIN_PREFETCH

Hm. Is a compiler test actually test anything reliably here? Won't this
just throw a warning during compile time about an unknown function?

> +/*
> + * Prefetch support -- Support memory prefetching hints on some platforms.
> + *
> + * pg_rfetch() is specialized for the case where an array is accessed
> + * sequentially, and we can prefetch a pointer within the next element (or an
> + * even later element) in order to hide memory latency. This case involves
> + * prefetching addresses with low temporal locality. Note that it's rather
> + * difficult to get any kind of speedup using pg_rfetch(); any use of the
> + * intrinsic should be carefully tested. Also note that it's okay to pass it
> + * an invalid or NULL address, although it's best avoided.
> + */
> +#if defined(USE_MEM_PREFETCH)
> +#define pg_rfetch(addr) __builtin_prefetch((addr), 0, 0)
> +#endif

What is that name actually supposed to mean? 'rfetch' doesn't seem to be
particularly clear - or is it a meant to be a wordplay combined with the
p?

I don't think you should specify the read/write and locality parameters
if we don't hand-pick them - right now you're saying the memory will
only be read and that it has no temporal locality.

I think there should be a empty fallback definition even if the current
only caller ends up not needing it - not all callers will require it.

> + /*
> + * Perform memory prefetch of tuple that's three places
> + * ahead of current (which is returned to caller).
> + * Testing shows that this significantly boosts the
> + * performance for TSS_INMEM "forward" callers by
> + * hiding memory latency behind their processing of
> + * returned tuples.
> + */
> +#ifdef USE_MEM_PREFETCH
> + if (readptr->current + 3 < state->memtupcount)
> + pg_rfetch(state->memtuples[readptr->current + 3]);
> +#endif

Hm. Why do we need a branch here? The target of prefetches don't need to
be valid addresses and adding a bunch of arithmetic and a branch for the
corner case doesn't sound worthwhile to me.

What worries me about adding explicit prefetching is that it's *highly*
platform and even micro-architecture dependent. Why is looking three
elements ahead the right value?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2015-09-02 14:14:52 Re: track_commit_timestamp and COMMIT PREPARED
Previous Message Bruce Momjian 2015-09-02 14:05:58 Re: Horizontal scalability/sharding