Re: nested queries vs. pg_stat_activity

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: nested queries vs. pg_stat_activity
Date: 2020-08-10 14:44:48
Message-ID: CABUevEzzwefAfJbJKvwYfJ73gp9SJd-5WAF-oNiVuY0+UjW=Qw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Aug 10, 2020 at 4:37 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> Is there a way that we can show information about nested queries in
> pg_stat_activity? It's often inconvenient for users when somebody's
> executing a function and it doesn't seem to be finishing as quickly as
> anticipated. You can't really tell where in that function things broke
> down. There are a couple of possible ways to attack this problem, but
> the one that I like best is to just try to advertise the query text of
> all the nested queries that are in progress rather than only the
> top-level query. This runs up against the problem that we have only a
> fixed-length buffer with which to work, but that doesn't seem like a
> huge problem: if the outer query fills the buffer, nested queries
> won't be advertised. If not, the first level of nested query can be
> advertised using the space remaining. If there's still space left,
> this can be repeated for multiple levels of nested queries until we
> run out of bytes. This requires some way of separating one query
> string from the next, but that seems like a pretty solvable problem.
> It also requires figuring out how this would show up in the output of
> pg_stat_activity, which I'm not quite sure about. And it might have
> performance issues too, for some use cases, but it could be an
> optional feature, so that people who don't want to pay the cost of
> updating the pg_stat_activity information more frequently do not need
> to do so.
>
> I'm curious to hear if other people agree that this is a problem, what
> they think about the above ideas for improving things, and if they've
> got any other suggestions.
>

This sounds very similar to the just-raised problem of multiple
semicolon-separated queries (see
https://www.postgresql.org/message-id/030a4123-550a-9dc1-d326-3cd5c46bcc59%40amazon.com).
They should definitely be considered in the same context, so we don't end
up creating two incompatible ideas.

Another idea around this, which I haven't really thought through, but
figured I'd throw out anyway. It doesn't have to be in pg_stat_activity, if
we can access it. E.g. we could have something like "SELECT * FROM
pg_querystack(<backend pid>)". In fact, I've also *often* wanted something
like "SELECT * FROM pg_queryhistory(<backend pid>)" to see the last 2-3
things it did *before* reaching the current point, as a way of identifying
where in an application this happened. That need has led to really ugly
hacks like https://github.com/mhagander/pg_commandhistory.

This is not information you'd need all that often, so I think it'd be
perfectly reasonable to say that you pay a higher price when you get it, if
we can keep down the cost of keeping it updated. Of course, this reduces it
down to "how can we get it". If each backend keeps this information
locally, we could send it a signal to say something like "dump what you
have now into shared memory over here" and read it from there -- which
would be cleaner than my hack which dumps it to the log.

I'm sure I'm missing many things in that, but as a wild idea :)

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/>
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Drouvot, Bertrand 2020-08-10 15:07:02 Re: Add information to rm_redo_error_callback()
Previous Message Tom Lane 2020-08-10 14:44:21 Re: Issue with cancel_before_shmem_exit while searching to remove a particular registered exit callbacks