Re: RFC: Logging plan of the running query

From: James Coleman <jtc331(at)gmail(dot)com>
To: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Greg Stark <stark(at)mit(dot)edu>, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, david(dot)christensen(at)crunchydata(dot)com, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Subject: Re: RFC: Logging plan of the running query
Date: 2023-06-05 18:26:25
Message-ID: CAAaqYe8LXVXQhYy3yT0QOHUymdM=uha0dJ0=BEPzVAx2nG1gsw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jun 5, 2023 at 4:30 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
>
> On 2023-06-03 02:51, James Coleman wrote:
> > Hello,
> >
> > Thanks for working on this patch!

Sure thing! I'm *very interested* in seeing this available, and I
think it paves the way for some additional features later on...

> > On Thu, Dec 8, 2022 at 12:10 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>
> ...
> > To put it positively: we believe that, for example, catalog accesses
> > inside CHECK_FOR_INTERRUPTS() -- assuming that the CFI call is inside
> > an existing valid transaction/query state, as it would be for this
> > patch -- are safe. If there were problems, then those problems are
> > likely bugs we already have in other CFI cases.
>
> Thanks a lot for the discussion and sharing it!
> I really appreciate it.
>
> BTW I'm not sure whether all the CFI are called in valid transaction,
> do you think we should check each of them?

I kicked off the regressions tests with a call to
ProcessLogQueryPlanInterrupt() in every single CHECK_FOR_INTERRUPTS()
call. Several hours and 52 GB of logs later I have confirmed that
(with the attached revision) at the very least the regression test
suite can't trigger any kind of failures regardless of when we trigger
this. The existing code in the patch for only running the explain when
there's an active query handling that.

> > Another concern Robert raised may apply here: what if a person tries
> > to cancel a query when we're explaining? I believe that's a reasonable
> > question to ask, but I believe it's a trade-off that's worth making
> > for the (significant) introspection benefits this patch would provide.
>
> Is the concern here limited to the case where explain code goes crazy
> as Robert pointed out?
> If so, this may be a trade-off worth doing.
> I am a little concerned about whether there will be cases where the
> explain code is not problematic but just takes long time.

The explain code could take a long time in some rare cases (e.g., we
discovered a bug a few years back with the planning code that actually
descends an index to find the max value), but I think the trade-off is
worth it.

> > On to the patch itself: overall I think it looks like it's in pretty
> > good shape. I also noticed we don't have any tests (I assume it'd have
> > to be TAP tests) of the actual output happening, and I think it would
> > be worth adding that.
>
> Checking the log output may be better, but I didn't add it since there
> is only a little content that can be checked, and similar function
> pg_log_backend_memory_contexts() does not do such type of tests.

Fair enough. I still think that would be an improvement here, but
others could also weigh in.

> > Are you interested in re-opening this patch? I'd be happy to provide
> > further review and help to try to push this along.
> Sure, I'm going to re-open this.

I've attached v27. The important change here in 0001 is that it
guarantees the interrupt handler is re-entrant, since that was a bug
exposed by my testing. I've also included 0002 which is only meant for
testing (it attempts to log in the plan in every
CHECK_FOR_INTERRUPTS() call).

Regards,
James

Attachment Content-Type Size
v27-0001-Add-function-to-log-the-plan-of-the-query.patch application/octet-stream 25.4 KB
v27-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch application/octet-stream 891 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2023-06-05 18:30:28 Re: Let's make PostgreSQL multi-threaded
Previous Message Alexander Pyhalov 2023-06-05 18:14:46 Re: Partial aggregates pushdown