Re: proposal: simple query profile and tracing API

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PAscal L <legrand_legrand(at)hotmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: proposal: simple query profile and tracing API
Date: 2018-11-14 10:26:21
Message-ID: CAFj8pRDibfJGy8VhGjqR01ji6V95CMSTerYs4zWq8OgqjTCc1w@mail.gmail.com
Lists: pgsql-hackers

On Tue, Nov 13, 2018 at 20:38 Tomas Vondra <
tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:

> On Tue, 2018-11-13 at 13:55 +0100, Pavel Stehule wrote:
> > On Tue, Nov 13, 2018 at 13:12 legrand legrand <
> > legrand_legrand(at)hotmail(dot)com> wrote:
> >
> > > Hello Pavel,
> > >
> > > What about using wait events and a trigger on pg_stat_activity ?
> > >
> >
> > pg_stat_activity is not guaranteed to show fresh data, and polling
> > pg_stat_activity can be too expensive for fast queries.
> >
>
> More importantly, how would you create a trigger on pg_stat_activity,
> considering it's a system view backed by SRF?
>
> > > ...
> > > An other solution: a customized version of pgsentinel (for high
> > > frequency sampling):
> > >
> >
> > I don't believe in the sampling method - I'm talking about queries
> > shorter than 10ms. I want to see 2-3ms of planning time and 2-5ms of
> > waits, which would require sampling about every 2ms, and that is
> > expensive.
> >
>
> You're quietly assuming that whatever alternative solution you end up
> inventing will be cheaper than this sampling. Which is going to be
> hard, if you want to do that for every execution of even the shortest
> queries. I'd say that's doomed to fail.
>
>
My idea is very simple.

1. Continuously collect timestamps - planning start, execution start,
waiting start, waiting end, query end.

2. Run a callback function after the query finishes, passing the collected
data to it.

Anybody can then do something different with these data: sometimes a simple
sum is enough, sometimes you need to increment bucketed counters, and
sometimes you need to store the raw values for median or percentile
calculations.

I think it can be very simple and fast, because you only have to work with
a single vector of metrics.

> Moreover, the sampling does not need to catch every query execution.
> The idea is to do it "just often enough" for some desired accuracy. For
> example you might pick 10ms interval - it will hit even shorter queries
> if they are executed often enough (and if they're not, who cares about
> them?). And given the sample percentages and total time, you can do
> some estimates for each query / phase.
>

With 10ms sampling the error would not be large, but 10ms sampling can use
too much CPU. And I don't see the advantage of a sampling-based method with
more complex processing (because you have to process more rows) over
session-based processing.

>
>
> regards
>
> --
> Tomas Vondra http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
