The main goal for this Generic Monitoring Framework is to provide a common interface for adding instrumentation points or probes to
Postgres so its behavior can be easily observed by developers and administrators even in production systems. This framework will allow Postgres to use the appropriate
monitoring/tracing facility provided by each OS. For example, Solaris and FreeBSD will use DTrace, and other OSes can use their own tools.
What is DTrace?
Some of you may have heard about or used DTrace already. In a nutshell, DTrace is a comprehensive dynamic tracing facility that is built into
Solaris and FreeBSD (mostly working) that can be used by administrators and developers on live production systems to examine the behavior
of both user programs and of the operating system.
DTrace can help answer difficult questions about the OS and the application itself. For example, you may want to ask:
- Show all functions that get invoked (userland & kernel) and execution time when my function foo() is called. Seeing the path a function
takes into the kernel may provide clues for performance tuning.
- Show how many times a particular lock is acquired and how long it's held. This can help identify contention in the system.
The best way to appreciate DTrace capabilities is by seeing a demo or through hands-on experience, and I plan to show some interesting
demos at the PG Summit.
There are a number of docs on DTrace, and here's a quick start doc and a complete reference guide.
Here is a recent DTrace for FreeBSD status
Open source apps that provide user level probes (bottom of page)
This solution is actually quite simple and non-intrusive.
1. Define macros PG_TRACE, PG_TRACE1, etc, in a new header file
called pg_trace.h with multiple #if defined(xxx) sections for Solaris,
FreeBSD, Linux, etc, and include pg_trace.h from c.h, which is itself included
by postgres.h and therefore by every C file.
The macros will have the following format:
PG_TRACE[n](module_name, probe_name [, arg1, ..., arg5])
module_name = Name to identify PG module such as pg_backend, pg_psql, pg_plpgsql, etc
probe_name = Probe name such as transaction_start, lwlock_acquire, etc
arg1..arg5 = Any args to pass to the probe such as txn id, lock id, etc
2. Map PG_TRACE, PG_TRACE1, etc, to macros or functions appropriate for each OS.
For OSes that don't have a suitable tracing facility, just map the macros to nothing - doing this will not have any effect on performance or
Sample of pg_trace.h
#if defined(sun) || defined(__FreeBSD__)
#include <sys/sdt.h>    /* provides DTRACE_PROBE, DTRACE_PROBE1, ... */
#define PG_TRACE DTRACE_PROBE
#define PG_TRACE1 DTRACE_PROBE1
#define PG_TRACE5 DTRACE_PROBE5
#elif defined(__linux__) || defined(_AIX) || defined(__sgi) ...
/* Map the macros to no-ops */
#define PG_TRACE(module, name)
#define PG_TRACE1(module, name, arg1)
#define PG_TRACE5(module, name, arg1, arg2, arg3, arg4, arg5)
#endif
3. Add any file(s) to support the particular OS tracing facility
4. Update the Makefiles as necessary for each OS
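As a rough illustration of steps 1 and 2, here is a minimal sketch of how the no-op fallback could behave in a single C file. PG_USE_DTRACE is a hypothetical build flag and demo_acquire_lock() is a made-up stand-in for a backend routine; neither is part of the proposal. The sketch also expands the no-op macros to ((void) 0) rather than to nothing, which is a small variation on the sample above.

```c
#include <assert.h>

/* PG_USE_DTRACE is a hypothetical build flag used only in this sketch. */
#if defined(PG_USE_DTRACE)
#include <sys/sdt.h>            /* DTRACE_PROBE, DTRACE_PROBE2, ... */
#define PG_TRACE(module, name)           DTRACE_PROBE(module, name)
#define PG_TRACE2(module, name, a1, a2)  DTRACE_PROBE2(module, name, a1, a2)
#else
/* No tracing facility: each probe expands to an empty expression,
 * so the arguments are never evaluated and no code is generated. */
#define PG_TRACE(module, name)           ((void) 0)
#define PG_TRACE2(module, name, a1, a2)  ((void) 0)
#endif

/* Hypothetical stand-in for a backend routine that carries a probe. */
static int
demo_acquire_lock(int lockid, int mode)
{
    assert(mode == 0 || mode == 1);     /* 0 = shared, 1 = exclusive (assumed) */

    PG_TRACE2(pg_backend, lwlock_acquire, lockid, mode);

    /* The probe must not change the result, with or without tracing. */
    return lockid * 10 + mode;
}
```

Because the no-op form never evaluates its arguments, probe arguments should be plain variables or cheap expressions without side effects, so that behavior is the same on every platform.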
How to add probes:
To add a probe, just add a one-line macro call at the appropriate location in the source. Here's an example of two probes, one with no
arguments and the other with two arguments:
PG_TRACE (pg_backend, fsync_start);
PG_TRACE2 (pg_backend, lwlock_acquire, lockid, mode);
If there are enough probes embedded in PG, its behavior can be easily observed.
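To make the placement of probes more concrete, here is a small sketch in which the macros are mapped to a trivial in-process counter instead of a real tracing facility. Everything except the PG_TRACE/PG_TRACE2 names and the pg_backend, fsync_start and lwlock_acquire identifiers is hypothetical: probe_hit(), probe_hits(), the demo_* functions and the fsync_done probe are invented for illustration only.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical fallback backend: count how often each probe fires. */
#define MAX_PROBES 16

struct probe_count
{
    const char *name;
    long        hits;
};

static struct probe_count probe_table[MAX_PROBES];
static int  probe_used = 0;

static void
probe_hit(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < probe_used; i++)
    {
        if (strcmp(probe_table[i].name, name) == 0)
        {
            probe_table[i].hits++;
            return;
        }
    }
    if (probe_used < MAX_PROBES)        /* silently drop if the table is full */
    {
        probe_table[probe_used].name = name;
        probe_table[probe_used].hits = 1;
        probe_used++;
    }
}

/* Return the number of times the named probe has fired so far. */
static long
probe_hits(const char *name)
{
    for (int i = 0; i < probe_used; i++)
        if (strcmp(probe_table[i].name, name) == 0)
            return probe_table[i].hits;
    return 0;
}

/* Map the macros to the counter; arguments are evaluated but ignored. */
#define PG_TRACE(module, name)  probe_hit(#module ":" #name)
#define PG_TRACE2(module, name, a1, a2) \
    ((void) (a1), (void) (a2), probe_hit(#module ":" #name))

static void
demo_fsync(void)
{
    PG_TRACE(pg_backend, fsync_start);
    /* ... the actual fsync work would go here ... */
    PG_TRACE(pg_backend, fsync_done);   /* hypothetical probe name */
}

static void
demo_lwlock_acquire(int lockid, int mode)
{
    PG_TRACE2(pg_backend, lwlock_acquire, lockid, mode);
}
```

Calling demo_fsync() and demo_lwlock_acquire() a few times and then reading probe_hits("pg_backend:lwlock_acquire") gives a per-process count, which is roughly the kind of question a real tracing facility would answer without any such bookkeeping in the program itself.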
With the help of Gavin Sherry, we have added about 20 probes, and Gavin has suggested a number of other interesting areas for additional probes.
Pervasive has also added some probes to PG 8.0.4 and posted the patch on http://pgfoundry.org/projects/dtrace/. I hope to combine the probes
using this generic framework for 8.1.4, and make it available for folks to try.
Since my knowledge of the PG source code is limited, I'm looking for assistance from experts to help identify some new interesting probe points.
How to use probes:
For DTrace, probes can be enabled using a D script. When the probes are not enabled, there is absolutely no performance hit whatsoever.
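Here is a simple example that prints the number of LWLock acquisitions for each PG process, using the lwlock_acquire probe shown above.
pg_backend*:::lwlock_acquire
{ @foo[pid] = count(); }
END
{ printf("\n%10s %15s\n", "PID", "Count"); printa("%10d %@15d\n", @foo); }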
I have a prototype working, so if anyone wants to try it, I can provide a patch or give access to my test system.
This is a proposal, so comments, suggestions, and feedback are certainly welcome.