Re: POC: Extension for adding distributed tracing - pg_tracing

From: Anthonin Bonnefoy <anthonin(dot)bonnefoy(at)datadoghq(dot)com>
To: Nikita Malakhov <hukutoc(at)gmail(dot)com>
Cc: Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: Extension for adding distributed tracing - pg_tracing
Date: 2023-08-09 08:34:41
Message-ID: CAO6_XqpDGFw=HRW_vLTk+wSjYu1FchB8Ab3=mUZYZj_P0htQLA@mail.gmail.com
Lists: pgsql-hackers

Hi!

> 1) query_id added so span to be able to join it with pg_stat_activity and
> pg_stat_statements;
Sounds good, I've merged your changes into my code.

> 2) table for storing spans added, to flush spans buffer
I'm not sure about this. It would make the feature available only on the
primary, since replicas can't write to the table. It would also make
version upgrades and migrations much more complex, and I haven't seen a
similar pattern in other extensions.

> 3) added setter function for sampling_rate GUC to tweak it on-the-fly
> without restart
OK, I've added this to my branch.

On my side, I've made the following changes:
1) All spans are now kept in palloc'd buffers and only added to the shared
buffer during end_tracing. This limits how often the shared_spans lock is
taken.
2) I've added a pg_tracing.drop_on_full_buffer parameter to drop all spans
when the buffer is full. This could be useful to always keep
the latest spans when the consuming app is not fast enough. This is also
useful for testing.
3) I'm testing more complex queries. Most of my previous tests used the
simple query protocol, but the extended protocol introduces differences
that break some assumptions I made. For example, with a multi-statement
transaction like
BEGIN;
SELECT 1;
SELECT 2;
the parse of SELECT 2 will happen before the ExecutorEnd (and the
end_tracing) of SELECT 1. For now, I'm skipping the post parse hook if
tracing is still ongoing.
I've also started running https://github.com/anse1/sqlsmith on a database
with full sampling. It currently triggers some assertion failures, which
I'm working on fixing.
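
To make points 1) to 3) concrete, here is a minimal standalone sketch in C.
It is not the code from the patch: every type and function name below is
hypothetical, the "lock" is only a counter standing in for the shared_spans
lock, and discarding new spans when drop_on_full_buffer is off is an
assumption on my part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SHARED_CAPACITY 4   /* tiny on purpose so the full case is easy to hit */
#define LOCAL_CAPACITY  8

/* Hypothetical span; a real one carries far more than a query id. */
typedef struct Span
{
    long query_id;
} Span;

/* Stand-in for the shared-memory buffer guarded by the shared_spans lock. */
typedef struct SharedSpans
{
    int    lock_acquisitions;   /* counts how often the "lock" was taken */
    size_t count;
    Span   spans[SHARED_CAPACITY];
} SharedSpans;

/* Stand-in for the backend-local palloc'd buffer. */
typedef struct LocalSpans
{
    size_t count;
    Span   spans[LOCAL_CAPACITY];
} LocalSpans;

static SharedSpans shared;                  /* zero-initialized */
static bool drop_on_full_buffer = false;    /* mirrors the parameter's intent */
static bool tracing_in_progress = false;

/* Point 3: ignore a new parse while the previous trace is still open. */
static bool
post_parse_sketch(void)
{
    if (tracing_in_progress)
        return false;
    tracing_in_progress = true;
    return true;
}

/* Point 1: spans are collected locally, without touching the lock. */
static bool
local_add_span(LocalSpans *local, long query_id)
{
    if (local->count == LOCAL_CAPACITY)
        return false;
    local->spans[local->count++].query_id = query_id;
    return true;
}

/* Points 1 and 2: one lock acquisition per trace; returns spans stored. */
static size_t
end_tracing_sketch(LocalSpans *local)
{
    size_t stored = 0;

    assert(local != NULL);
    shared.lock_acquisitions++;     /* "acquire" the lock */
    for (size_t i = 0; i < local->count; i++)
    {
        if (shared.count == SHARED_CAPACITY)
        {
            if (!drop_on_full_buffer)
                break;              /* assumption: remaining new spans are discarded */
            shared.count = 0;       /* drop all stored spans to keep the latest */
        }
        shared.spans[shared.count++] = local->spans[i];
        stored++;
    }
    /* "release" the lock here */
    local->count = 0;
    tracing_in_progress = false;
    return stored;
}
```

In this sketch the lock is taken once per trace rather than once per span,
and with drop_on_full_buffer enabled a slow consumer loses old spans rather
than new ones.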

On Thu, Aug 3, 2023 at 9:13 PM Nikita Malakhov <hukutoc(at)gmail(dot)com> wrote:

> Hi!
>
> Please check some suggested improvements -
> 1) query_id added so span to be able to join it with pg_stat_activity and
> pg_stat_statements;
> 2) table for storing spans added, to flush spans buffer, for maintenance
> reasons - to keep track of spans,
> with SQL function that flushes buffer into table instead of recordset;
> 3) added setter function for sampling_rate GUC to tweak it on-the-fly
> without restart.
>
> --
> Regards,
> Nikita Malakhov
> Postgres Professional
> The Russian Postgres Company
> https://postgrespro.ru/
>

Attachment Content-Type Size
pg-tracing-v3.patch application/octet-stream 141.7 KB
