Re: Auto explain after query timeout

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Gurjeet <singh(dot)gurjeet(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Auto explain after query timeout
Date: 2022-09-20 22:56:08
Message-ID: CA+TgmoYp8KRt=JqsZdfCki_z5xHwkjcpVRXfn9a34d6xiOS1yQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 20, 2022 at 5:08 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
> - A safe explain (e.g., disallow catalog access) that is potentially
> missing information.

This would be pretty useless I think, because you'd be missing all
relation names.

> - A safe way to interrupt queries such as "safe shutdown" of a node
> (e.g., a seq scan could stop returning tuples early) and allow a
> configurable buffer of time after the statement timeout before firing
> a hard abort of the query (and transaction).

This might be useful, but it seems extremely difficult to get working.
You'd not only have to design the safe shutdown mechanism itself, but
also find a way to safely engage it at the right times.

> Alternatively I wonder if it's possible (this would maybe assume no
> catalog changes in the current transaction -- or at least none that
> would be referenced by the current query) to open a new transaction
> (with the same horizon information) and duplicate the plan over to
> that transaction and run the explain there. This way you do it *after*
> the error is raised. That's some serious spit-balling -- I'm not
> saying that's doable, just trying to imagine how one might
> comprehensively address the concerns.

Doesn't work, because the new transaction's snapshot wouldn't be the
same as that of the old one. Imagine that you create a table and run a
query on it in the same transaction. Then you migrate the plan tree to
a new transaction and try to find out the table name. But in the new
transaction, that table doesn't exist: it was destroyed by the
previous rollback.

Honestly I have no very good ideas how to create the feature you want
here. I guess the only thing I can think of is to separate the EXPLAIN
process into two phases: a first phase that runs when the plan tree is
set up and gathers all of the information that we might need later,
like relation names, and then a second phase that runs later when you
want to generate the output and does nothing that can fail, or at
least no database: maybe it's allowed to allocate memory, for example.
But that sounds like a big and perhaps painful refactoring exercise,
and I can imagine that there might be reasons why it doesn't work out.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2022-09-20 23:00:26 Re: Use pg_pwritev_with_retry() instead of write() in dir_open_for_write() to avoid partial writes?
Previous Message Peter Geoghegan 2022-09-20 22:12:00 Re: Reducing the WAL overhead of freezing in VACUUM by deduplicating per-tuple freeze plans