Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore

From: Andres Freund <andres(at)anarazel(dot)de>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memory prefetching while sequentially fetching from SortTuple array, tuplestore
Date: 2015-09-02 22:13:13
Message-ID: 20150902221313.GA8555@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2015-09-02 12:24:33 -0700, Peter Geoghegan wrote:
> "Read fetch". One argument past to the intrinsic here specifies that
> the variable will be read only. I did things this way because I
> imagined that there would be very limited uses for the macro only. I
> probably cover almost all interesting cases for explicit memory
> prefetching already.

I'd be less brief in this case then, no need to be super short here.

> > I don't think you should specify the read/write and locality parameters
> > if we don't hand-pick them - right now you're saying the memory will
> > only be read and that it has no temporal locality.
> >
> > I think there should be a empty fallback definition even if the current
> > only caller ends up not needing it - not all callers will require it.
>
> It started out that way, but Tom felt that it was better to have a
> USE_MEM_PREFETCH because of the branch below...

That doesn't mean we shouldn't still provide an empty definition.

> > Hm. Why do we need a branch here? The target of prefetches don't need to
> > be valid addresses and adding a bunch of arithmetic and a branch for the
> > corner case doesn't sound worthwhile to me.
>
> ...which is needed, because I'm still required to not dereference wild
> pointers. > In other words, although pg_rfetch()/__builtin_prefetch()
> does not require that a valid pointer be passed, it is not okay to
> read past an array's bounds to read that pointer. The GCC docs are
> clear on this -- "Data prefetch does not generate faults if addr is
> invalid, but the address expression itself must be valid".

That's just a question of how to formulate this though?

pg_rfetch(((char *) state->memtuples ) + 3 * sizeof(SortTuple) + offsetof(SortTuple, tuple))?

For something heavily platform dependent like this that seems ok.

> Because that was the fastest value following testing on my laptop. You
> are absolutely right to point out that this isn't a good reason to go
> with the patch -- I share your concern. All I can say in defense of
> that is that other major system software does the same, without any
> regard for the underlying microarchitecture AFAICT.

I know linux stripped out most prefetches at some point, even from x86
specific code, because it showed that they aged very badly. I.e. they
removed a bunch of them and stuff got faster, whereas they were
beneficial on earlier architectures.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2015-09-02 22:16:58 Re: Improving replay of XLOG_BTREE_VACUUM records
Previous Message Jeff Janes 2015-09-02 22:10:08 Re: Allow a per-tablespace effective_io_concurrency setting