Re: POC: Extension for adding distributed tracing - pg_tracing

From: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
To: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
Cc: Nikita Malakhov <hukutoc(at)gmail(dot)com>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Extension for adding distributed tracing - pg_tracing
Date: 2023-11-15 15:20:04
Message-ID: CAO6_XqqfkB0NqZSm0ZiKjiPGZc7=m_urLbtMkT13vWWN+vcynw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Some small changes, mostly around making tests less flaky:
- Removed the top_span and nested_level in the span output, those were
mostly used for debugging
- More tests around Parse span in nested queries
- Remove explicit queryId in the tests

Regards,
Anthonin

On Mon, Nov 6, 2023 at 4:08 PM Anthonin Bonnefoy
<anthonin(dot)bonnefoy(at)datadoghq(dot)com> wrote:
>
> Hi!
>
> The commit 13d00729d422c84b1764c24251abcc785ea4adb1 renamed some
> fields in the BufferUsage struct. I've updated those which should fix the
> compilation errors.
>
> I've also added some changes, mainly around how lazy functions are managed.
> Some queries like 'select character_maximum_length from
> information_schema.element_types limit 1 offset 3;' could generate 50K spans
> due to lazy calls of generate_series. Tracing of lazy functions is now disabled.
>
> I've also written an otel version of the trace forwarder available in
> https://github.com/bonnefoa/pg-tracing-otel-forwarder. The repository
> contains a docker compose to start a local otel collector and jaeger
> interface and will be able to consume the spans from a test PostgreSQL
> instance with pg_tracing.
>
> Regards,
> Anthonin
>
> On Mon, Nov 6, 2023 at 7:19 AM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > On Thu, 12 Oct 2023 at 14:32, Anthonin Bonnefoy
> > <anthonin(dot)bonnefoy(at)datadoghq(dot)com> wrote:
> > >
> > > Hi!
> > >
> > > I've made a new batch of changes and improvements.
> > > New features:
> > > - Triggers are now correctly supported. They were not correctly
> > > attached to the ExecutorFinish Span before.
> > > - Additional configuration: exporting parameter values in span
> > > metadata can be disabled. Deparsing can also be disabled now.
> > > - Dbid and userid are now exported in span's metadata
> > > - New Notify channel and threshold parameters. This channel will
> > > receive a notification when the span buffer usage crosses the
> > > threshold.
> > > - Explicit transaction with extended protocol is now correctly
> > > handled. This is done by keeping 2 different trace contexts, one for
> > > parser/planner trace context at the root level and the other for
> > > nested queries and executors. The spans now correctly show the parse
> > > of the next statement happening before the ExecutorEnd of the previous
> > > statement (see screenshot).
> > >
> > > Changes:
> > > - Parse span is now outside of the top span. When multiple statements
> > > are processed, they are parsed together so it didn't make sense to
> > > attach Parse to a specific statement.
> > > - Deparse info is now exported in its own column. This will leave the
> > > possibility to the trace forwarder to either use it as metadata or put
> > > it in the span name.
> > > - Renaming: Spans now have a span_type (Select, Executor, Parser...)
> > > and a span_operation (ExecutorRun, Select $1...)
> > > - For spans created without propagated trace context, the same traceid
> > > will be used for statements within the same transaction.
> > > - pg_tracing_consume_spans and pg_tracing_peek_spans are now
> > > restricted to users with pg_read_all_stats role
> > > - In instrument.h, instr->firsttime is only set if timer was requested
> > >
> > > The code should be more stable from now on. Most of the features I had
> > > planned are implemented.
> > > I've also started writing the module's documentation. It's still a bit
> > > bare but I will be working on completing this.
> >
> > In CFbot, I see that build is failing for this patch (link:
> > https://cirrus-ci.com/task/5378223450619904?logs=build#L1532) with
> > following error:
> > [07:58:05.037] In file included from ../src/include/executor/instrument.h:16,
> > [07:58:05.037] from ../src/include/jit/jit.h:14,
> > [07:58:05.037] from ../contrib/pg_tracing/span.h:16,
> > [07:58:05.037] from ../contrib/pg_tracing/pg_tracing_explain.h:4,
> > [07:58:05.037] from ../contrib/pg_tracing/pg_tracing.c:43:
> > [07:58:05.037] ../contrib/pg_tracing/pg_tracing.c: In function
> > ‘add_node_counters’:
> > [07:58:05.037] ../contrib/pg_tracing/pg_tracing.c:2330:70: error:
> > ‘BufferUsage’ has no member named ‘blk_read_time’; did you mean
> > ‘temp_blk_read_time’?
> > [07:58:05.037] 2330 | blk_read_time =
> > INSTR_TIME_GET_MILLISEC(node_counters->buffer_usage.blk_read_time);
> > [07:58:05.037] | ^~~~~~~~~~~~~
> > [07:58:05.037] ../src/include/portability/instr_time.h:126:12: note:
> > in definition of macro ‘INSTR_TIME_GET_NANOSEC’
> > [07:58:05.037] 126 | ((int64) (t).ticks)
> > [07:58:05.037] | ^
> > [07:58:05.037] ../contrib/pg_tracing/pg_tracing.c:2330:18: note: in
> > expansion of macro ‘INSTR_TIME_GET_MILLISEC’
> > [07:58:05.037] 2330 | blk_read_time =
> > INSTR_TIME_GET_MILLISEC(node_counters->buffer_usage.blk_read_time);
> > [07:58:05.037] | ^~~~~~~~~~~~~~~~~~~~~~~
> > [07:58:05.037] ../contrib/pg_tracing/pg_tracing.c:2331:71: error:
> > ‘BufferUsage’ has no member named ‘blk_write_time’; did you mean
> > ‘temp_blk_write_time’?
> > [07:58:05.037] 2331 | blk_write_time =
> > INSTR_TIME_GET_MILLISEC(node_counters->buffer_usage.blk_write_time);
> > [07:58:05.037] | ^~~~~~~~~~~~~~
> > [07:58:05.037] ../src/include/portability/instr_time.h:126:12: note:
> > in definition of macro ‘INSTR_TIME_GET_NANOSEC’
> > [07:58:05.037] 126 | ((int64) (t).ticks)
> > [07:58:05.037] | ^
> > [07:58:05.037] ../contrib/pg_tracing/pg_tracing.c:2331:19: note: in
> > expansion of macro ‘INSTR_TIME_GET_MILLISEC’
> > [07:58:05.037] 2331 | blk_write_time =
> > INSTR_TIME_GET_MILLISEC(node_counters->buffer_usage.blk_write_time);
> > [07:58:05.037] | ^~~~~~~~~~~~~~~~~~~~~~~
> >
> > I also tried to build this patch locally and the build is failing with
> > the same error.
> >
> > Thanks
> > Shlok Kumar Kyal

Attachment Content-Type Size
v10-0001-Add-pg_tracing-extension-to-contrib.patch application/octet-stream 323.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2023-11-15 15:52:34 Re: typo in fallback implementation for pg_atomic_test_set_flag()
Previous Message Tom Lane 2023-11-15 15:15:55 Re: BUG #18097: Immutable expression not allowed in generated at