Re: asynchronous and vectorized execution

From: Andres Freund <andres(at)anarazel(dot)de>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>, Mark Wong <mark(at)2ndquadrant(dot)com>
Cc: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: asynchronous and vectorized execution
Date: 2016-05-11 00:50:16
Message-ID: 20160511005016.j3m7wkk6cafx2ccr@alap3.anarazel.de
Lists: pgsql-hackers

On 2016-05-10 12:56:17 -0400, Robert Haas wrote:
> I suspect the number of queries that are being hurt by fmgr overhead
> is really large, and I think it would be nice to attack that problem
> more directly. It's a bit hard to discuss what's worthwhile in the
> abstract, without performance numbers, but when you vectorize, how
> much is the benefit from using SIMD instructions and how much is the
> benefit from just not going through the fmgr every time?

I think fmgr overhead is an issue, but in most profiles of
execution-heavy workloads I've seen, the bottlenecks are elsewhere. They
often look roughly like
+ 15.47% postgres postgres [.] slot_deform_tuple
+ 12.99% postgres postgres [.] slot_getattr
+ 10.36% postgres postgres [.] ExecMakeFunctionResultNoSets
+ 9.76% postgres postgres [.] heap_getnext
+ 6.34% postgres postgres [.] HeapTupleSatisfiesMVCC
+ 5.09% postgres postgres [.] heapgetpage
+ 4.59% postgres postgres [.] hash_search_with_hash_value
+ 4.36% postgres postgres [.] ExecQual
+ 3.30% postgres postgres [.] ExecStoreTuple
+ 3.29% postgres postgres [.] ExecScan

or

- 33.67% postgres postgres [.] ExecMakeFunctionResultNoSets
   - ExecMakeFunctionResultNoSets
      + 99.11% ExecEvalOr
      + 0.89% ExecQual
+ 14.32% postgres postgres [.] slot_getattr
+ 5.66% postgres postgres [.] ExecEvalOr
+ 5.06% postgres postgres [.] check_stack_depth
+ 5.02% postgres postgres [.] slot_deform_tuple
+ 4.05% postgres postgres [.] pgstat_end_function_usage
+ 3.69% postgres postgres [.] heap_getnext
+ 3.41% postgres postgres [.] ExecEvalScalarVarFast
+ 3.36% postgres postgres [.] ExecEvalConst

with a healthy dose of _bt_compare and heap_hot_search_buffer in more
index-heavy workloads.

(Yes, I just pulled these example profiles from somewhere, but I've seen
profiles that look like this more often than ones that are very fmgr
heavy.)

That seems to suggest that we need to restructure how we get to calling
fmgr functions, before worrying about the actual fmgr call.
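
As a purely illustrative toy (not PostgreSQL code; every type and
function name below is made up for this sketch), the difference between
reaching each comparison through a recursive tree walk and reaching it
from a flat list of steps could look roughly like this for a qual of
the form (col0 = 1) OR (col1 = 2). It only shows the two control-flow
shapes and checks that they agree; it makes no claim about how the
executor is or should be structured.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy row with two integer columns; not a PostgreSQL struct. */
typedef struct Row { int cols[2]; } Row;

/* Variant 1: recursive tree walk, one indirect call per node per row. */
typedef struct Node Node;
struct Node {
    bool (*eval)(const Node *self, const Row *row);
    int col;              /* column index (equality nodes) */
    int constant;         /* comparison constant (equality nodes) */
    const Node *left;     /* children (OR nodes) */
    const Node *right;
};

static bool eval_eq(const Node *self, const Row *row)
{
    return row->cols[self->col] == self->constant;
}

static bool eval_or(const Node *self, const Row *row)
{
    /* short-circuit OR, recursing through function pointers */
    return self->left->eval(self->left, row) ||
           self->right->eval(self->right, row);
}

/* Variant 2: flat step array, evaluated by one loop with a switch. */
typedef enum { STEP_EQ, STEP_JUMP_IF_TRUE, STEP_DONE } StepKind;

typedef struct Step {
    StepKind kind;
    int col;              /* used by STEP_EQ */
    int constant;         /* used by STEP_EQ */
    size_t target;        /* used by STEP_JUMP_IF_TRUE */
} Step;

static bool eval_flat(const Step *steps, const Row *row)
{
    bool result = false;
    size_t pc = 0;

    for (;;) {
        const Step *s = &steps[pc];

        switch (s->kind) {
        case STEP_EQ:
            result = (row->cols[s->col] == s->constant);
            pc++;
            break;
        case STEP_JUMP_IF_TRUE:
            pc = result ? s->target : pc + 1;
            break;
        case STEP_DONE:
            return result;
        }
    }
}

/* Evaluate (col0 = 1) OR (col1 = 2) with the tree walk. */
bool qual_tree(int a, int b)
{
    const Node eq_a = { eval_eq, 0, 1, NULL, NULL };
    const Node eq_b = { eval_eq, 1, 2, NULL, NULL };
    const Node or_node = { eval_or, 0, 0, &eq_a, &eq_b };
    const Row row = { { a, b } };

    return or_node.eval(&or_node, &row);
}

/* Evaluate the same qual with the flat step list. */
bool qual_flat(int a, int b)
{
    const Step steps[] = {
        { STEP_EQ, 0, 1, 0 },
        { STEP_JUMP_IF_TRUE, 0, 0, 3 },   /* skip second test if true */
        { STEP_EQ, 1, 2, 0 },
        { STEP_DONE, 0, 0, 0 },
    };
    const Row row = { { a, b } };

    return eval_flat(steps, &row);
}

/* Both variants must give the same answer for every row tried. */
void check_variants_agree(void)
{
    for (int a = 0; a < 3; a++)
        for (int b = 0; b < 3; b++)
            assert(qual_tree(a, b) == qual_flat(a, b));
}
```

Calling check_variants_agree() from a small main() is enough to try it
out. Any performance difference between the two shapes would have to be
measured on a real workload; the toy says nothing about that.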

Tomas, Mark, you'd both generated perf profiles for TPC-H (IIRC?)
queries at some point. Any chance the results are online somewhere?

Greetings,

Andres Freund
