Re: TRACE_SORT defined by default

From: Craig Ringer <craig(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Joe Conway <mail(at)joeconway(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: TRACE_SORT defined by default
Date: 2019-10-29 08:54:47
Message-ID: CAMsr+YHa-7T+TzVTE4GxR-8mzu2H_-jDRri4_euyJAQs+31nWw@mail.gmail.com
Lists: pgsql-hackers

On Thu, 25 Apr 2019 at 06:41, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Wed, Apr 24, 2019 at 3:04 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > It is disabled by default, in the sense that the trace_sort GUC
> > > defaults to off. I believe that the overhead of building in the
> > > instrumentation without enabling it is indistinguishable from zero.
> >
> > It would probably be useful to actually prove that rather than just
> > assuming it.
>
> The number of individual trace_sort LOG messages is proportionate to
> the number of runs produced.
>
> > I do see some code under the symbol that is executed
> > even when !trace_sort, and in any case Andres keeps complaining that
> > even always-taken branches are expensive ...
>
> I think that you're referring to the stuff needed for the DTrace
> probes. It's a pity that there isn't better support for that, since
> Linux has a lot of options around static userspace probes these days
> (SystemTap is very much out of favor, and never was very popular).

Which is a real shame. I got into it last week and I cannot believe
I've wasted time and effort trying to get anything useful out of perf
when it comes to tracing. There's just no comparison.

As usual, the incoming replacement tools are broken, incompatible and
incomplete, especially for userspace. Because really, bah, users, who
cares about users? Yet issues with systemtap are being dismissed
with variants of "that's obsolete, use perf or eBPF-tools". Right,
just like a CNC router can be easily replaced with a chisel.

I expect eBPF-tools will be pretty amazing once it matures. But for
those of us stuck in "working with actual user applications" land, on
RHEL6 and RHEL7 and the like, it won't be doing so in a hurry.

With that said, static probes are useful but frustrating. The
dtrace-alike model SystemTap adopted, and that we use in Pg via
SystemTap's 'dtrace' script, generates quite dtrace-alike static probe
points, complete with the frustrating deficiencies of those probe
points. Most importantly, they don't record the probe argument names
or data types, and if you want to get a string value you need to
handle each string probe argument individually.

Static probes are fantastic as labels and they let you see the program
state in places that are often impractical for debuginfo-based DWARF
probing since they give you a stable, consistent way to see something
other than function args and return values. But they're frustratingly
deficient compared to DWARF-based probes in other ways.

> There seems to be a recognition among the Linux people that the
> distinction between users and backend experts is blurred. The kernel
> support for things like eBPF and BCC is still patchy, but that will
> change.

Just in time for it to be deemed obsolete and replaced with another
tool that only works with the kernel for the first couple of years,
probably. Like perf was.

> > Would any non-wizard really have a use for it?

Extremely strongly yes.

If nothing else you can wrap these tools up into toolsets and scripts
that give people insights into running systems just by running the
script. Non-invasively, without code changes, on an existing running
system.

I wrote a systemtap script last week that tracks each
transaction in a Pg backend from xid allocation through to
commit/rollback/prepare/commitprepared/rollbackprepared and takes
stats on xid allocations, txn durations, etc. Then figures out if the
committed txn needs logical decoding by any existing slots and tracks
how long each txn takes from commit until a given logical walsender
finishes decoding it and sending it. Collects stats on
inserts/updates/deletes etc in each txn, txn size, etc. Then follows
the txn and observes it being received and applied by a logical
replication client (pglogical). So it can report latencies and
throughputs in every step through the logical replication pipeline.

Right now that script isn't pretty, and it's not something easily
reused. But I could wrap it up in a script that turned on/off parts
based on what a user needed, wrote the stats to csv for
postprocessing, etc.
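
To give a rough idea of what that CSV postprocessing step could look
like, here is a minimal Python sketch. It is not the script described
above. The column names (xid, event, ts_us) and the event names
(xid_assigned, commit, decoded_and_sent) are hypothetical, chosen only
for illustration; a real tracing script would emit whatever fields it
happens to collect.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical CSV layout: one row per traced event, keyed by xid.
# Neither the column names nor the event names come from a real script.
SAMPLE = """xid,event,ts_us
1001,xid_assigned,100
1001,commit,350
1001,decoded_and_sent,900
1002,xid_assigned,200
1002,commit,260
1002,decoded_and_sent,1260
"""


def stage_latencies(rows):
    """Return {(from_event, to_event): [latency_us, ...]} over all xids."""
    by_xid = defaultdict(list)
    for row in rows:
        by_xid[row["xid"]].append((int(row["ts_us"]), row["event"]))

    latencies = defaultdict(list)
    for events in by_xid.values():
        events.sort()  # order each transaction's events by timestamp
        # Pair every event with the one that follows it in the same txn.
        for (t0, e0), (t1, e1) in zip(events, events[1:]):
            latencies[(e0, e1)].append(t1 - t0)
    return dict(latencies)


def summarize(csv_text):
    """Return {(from_event, to_event): (count, mean_us, max_us)}."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return {
        stage: (len(values), mean(values), max(values))
        for stage, values in stage_latencies(rows).items()
    }


if __name__ == "__main__":
    for (start, end), (n, avg, worst) in summarize(SAMPLE).items():
        print(f"{start} -> {end}: n={n} mean={avg:.0f}us max={worst}us")
```

The same per-stage grouping would extend to further steps in the
pipeline (receive and apply on the replication client, say) simply by
logging more event types per xid.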

The underlying tool isn't for end users. But the end result sure can
be. After all, we don't expect users to mess with xact.c and
transam.c, we just expect them to run SQL, but we don't say PostgreSQL
is only for wizards, not end users.

For that reason I'm trying to find time to add a large pile more probe
points to PostgreSQL. Informed in part by what I learned writing the
above script. I want probes for WaitEventSetWait, transam activities
(especially xid allocation, commit, rollback), 2pc, walsender
activity, reorderbuffer, walreceiver, slots, global xmin/catalog_xmin
changes, writes, file flushes, buffer access, and lots more. (Pg's
existing probes on transaction start and finish are almost totally
useless as you can't tell if the txn then gets an xid allocated,
whether the commit generates an xlog record or not, etc).

--
Craig Ringer http://www.2ndQuadrant.com/
2ndQuadrant - PostgreSQL Solutions for the Enterprise
