Re: POC: Extension for adding distributed tracing - pg_tracing

From: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
To: Jan Katins <jasc(at)gmx(dot)net>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Peter Smith <smithpb2250(at)gmail(dot)com>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Nikita Malakhov <hukutoc(at)gmail(dot)com>, Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>
Subject: Re: POC: Extension for adding distributed tracing - pg_tracing
Date: 2024-02-06 10:13:09
Message-ID: CAO6_Xqob-NideiVhARvdgnQpwkp1aCbAxihiCmuoCHjmdoTwow@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi!

On Fri, Jan 26, 2024 at 1:43 PM Nikita Malakhov <hukutoc(at)gmail(dot)com> wrote:
> It's a good idea to split a big patch into several smaller ones.
> But you've already implemented these features - you could provide them
> as sequential small patches (i.e. v13-0002-guc-context-propagation.patch and so on)

Keeping a consistent set of patches is going to be hard as any change
in the first patch would require rebasing that are likely going to
conflict. I haven't discarded the features but keeping them separated
could help with the discussion.

On Fri, Feb 2, 2024 at 1:41 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>
> Hi Anthonin,
>
> I'm only now catching up on this thread. Very exciting feature!
Thanks! And thanks for the comments.

> - You're relying on various existing hooks, but it might be better to
> have the 'begin_span'/'end_span' calls directly in PostgreSQL source
> code in the right places. There are more places where spans might be
> nice, like when performing a large sort in tuplesort.c to name one
> example. Or even across every I/O; not sure how much overhead the spans
> incur and how much we'd be willing to accept. 'begin_span'/'end_span'
> could be new hooks that normally do nothing, but your extension would
> implement them.

Having information that would only be available to tracing doesn't
sound great. It probably should be added to the core instrumentation
which could be then leveraged by tracing and other systems.
One of the features I initially implemented was to track parse start
and end. However, since only the post parse hook is available, I could
only infer when the parse started. Changing the core to track parse
start would make this more easier and it could be used by EXPLAIN and
pg_stat_statements.
In the end, I've tried to match what's done in pg_stat_statements
since the core idea is the same, except pg_tracing outputs spans
instead of statistics.

> - How to pass the tracing context from the client? You went with
> SQLCommenter and others proposed a GUC. A protocol extension would make
> sense too. I can see merits to all of those. It probably would be good
> to support multiple mechanisms, but some might need core changes. It
> might be good to implement the mechanism for accepting trace context in
> core. Without the extension, we could still include the trace ID in the
> logs, for example.

I went with SQLCommenter for the initial implementation since it has
the benefit of already being implemented and working out of the box.
Jelte has an ongoing patch to add the possibility to modify GUC values
through a protocol message [1]. I've tested it and it worked well
without adding too much complexity in pg_tracing. I imagine having a
dedicated traceparent GUC variable could be also used in the core to
propagate in the logs and other places.

> - Include EXPLAIN plans in the traces.
This was already implemented in the first versions (or rather, the
planstate was translated to spans) of the patch. I've removed it in
the latest patch as it was contributing to 50% of the code and also
required a core change, namely a change in instrument.c to track the
first start of a node. Though if you were thinking of the raw EXPLAIN
ANALYZE text dumped as a span metadata, that could be doable.

> # Comments on the implementation
>
> There was discussion already on push vs pull model. Currently, you
> collect the traces in memory / files, and require a client to poll them.
> A push model is more common in applications that support tracing. If
> Postgres could connect directly to the OTEL collector, you'd need one
> fewer running component. You could keep the traces in backend-private
> memory, no need to deal with shared memory and spilling to files.

PostgreSQL metrics are currently exposed through tables like pg_stat_*
and users likely have collectors pulling those metrics (like OTEL
PostgreSQL Receiver [2]). Having this collector also pulling traces
would keep the logic in a single place and would avoid a situation
where both pull and push is used.

> 2. Start a new pgsql-hackers thread on in-core changes that would
> benefit the extension. Write patches for them. I'm thinking of the
> things I listed in the Core changes section above, but maybe there are
> others.
I already have an ongoing patch (more support for extended protocol in
psql [3] needed for tests) and plan to do others (the parse changes
I've mentioned). Though I also don't want to spread myself too thin.

[1]: https://commitfest.postgresql.org/46/4736/
[2]: https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/postgresqlreceiver/README.md
[3]: https://commitfest.postgresql.org/46/4650/

On Fri, Feb 2, 2024 at 5:42 PM Jan Katins <jasc(at)gmx(dot)net> wrote:
>
> Hi!
>
> On Fri, 2 Feb 2024 at 13:41, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>>
>> PS. Does any other DBMS implement this? Any lessons to be learned from them?
>
>
> Mysql 8.3 has open telemetrie component, so very fresh:
>
> https://dev.mysql.com/doc/refman/8.3/en/telemetry-trace-install.html
> https://blogs.oracle.com/mysql/post/mysql-telemetry-tracing-with-oci-apm
>
> Mysql ppl are already waiting for PG to catch up (in German): https://chaos.social/@isotopp/111421160719915296 :-)
>
> Best regards,
>
> Jan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Dilip Kumar 2024-02-06 10:18:14 Re: Question on LWLockMode in dsa.c
Previous Message Amit Kapila 2024-02-06 10:10:49 Re: Synchronizing slots from primary to standby