Quick Links

Re: Sampling-based timing for EXPLAIN ANALYZE

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	David Geier <geidav(dot)pg(at)gmail(dot)com>
Cc:	Jelte Fennema <me(at)jeltef(dot)nl>, Lukas Fittl <lukas(at)fittl(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Sampling-based timing for EXPLAIN ANALYZE
Date:	2023-01-15 20:22:12
Message-ID:	20230115202212.oh7qu7oaeyr5mzsz@awork3.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

On 2023-01-13 09:11:06 +0100, David Geier wrote:
> Mostly I'm wondering if the sampling based approach gains us enough to be
> worth it, once the patch to use RDTSC hopefully landed (see [1]).

Well, I'm not sure we have a path forward on it. There's portability and
accuracy concerns. But more importantly:

> I believe that with the RDTSC patch the overhead of TIMING ON is lower than
> the overhead of using ANALYZE with TIMING OFF in the first place. Hence, to
> be really useful, it would be great if we could on top of TIMING SAMPLING
> also lower the overhead of ANALYZE itself further (e.g. by using a fast path
> for the default EXPLAIN (ANALYZE, TIMING ON / SAMPLING)). Currently,
> InstrStartNode() and InstrStopNode() have a ton of branches and without all
> the typically deactivated code the implementation would be very small and
> could be placed in an inlinable function.

Yes, I think SAMPLING could get rid of most of the instrumentation overhead -
at the cost of a bit less detail in the explain, of course. Which could make
it a lot more feasible to enable something like auto_explain.log_timing in
busy workloads.

For the sampling mode we don't really need something like
InstrStart/StopNode. We just need a pointer to node currently executing - not
free to set, but still a heck of a lot cheaper than InstrStopNode(), even
without ->need_timer etc. Then the timer just needs to do
instr->sampled_total += (now - last_sample)
last_sample = now

I've been thinking that we should consider making more of the instrumentation
code work like that. The amount of work we're doing in InstrStart/StopNode()
has steadily risen. When buffer usage and WAL usage are enabled, we're
executing over 1000 instructions! And each single Instrumentation node is ~450
bytes, to a good degree due to having 2 BufUsage and 2 WalUsage structs
embedded.

If we instead have InstrStartNode() set up a global pointer to the
Instrumentation node, we can make the instrumentation code modify both the
"global" counters (pgBufferUsage, pgWalUsage) and, if set,
current_instr->{pgBufferUsage, pgWalUsage}. That'll require some larger
changes - right now nodes "automatically" include the IO/WAL incurred in child
nodes, but that's just a small bit of additional summin-up to be done during
EXPLAIN.

Separately, I think we should consider re-ordering Instrumentation so that
bufusage_start, walusage_start are after the much more commonly used
elements. We're forcing ntuples, nloops, .. onto separate cachelines, even
though they're accounted for unconditionally.

Greetings,

Andres Freund

In response to

Re: Sampling-based timing for EXPLAIN ANALYZE at 2023-01-13 08:11:06 from David Geier

Responses

Re: Sampling-based timing for EXPLAIN ANALYZE at 2023-01-17 14:52:07 from Tomas Vondra

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andres Freund	2023-01-15 20:36:31	Re: logrep stuck with 'ERROR: int2vector has too many elements'
Previous Message	Tom Lane	2023-01-15 20:17:16	Re: logrep stuck with 'ERROR: int2vector has too many elements'