Re: proposal - log_full_scan

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Daniel Gustafsson <daniel(at)yesql(dot)se>
Cc: Julien Rouhaud <rjuju123(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: proposal - log_full_scan
Date: 2021-07-06 16:14:23
Message-ID: CAFj8pRCyJGpYTU3uLkedhuL2TF2yVOkhZvfGwdE0uk3tT_KcBQ@mail.gmail.com
Lists: pgsql-hackers

Hi

On Tue, Jul 6, 2021 at 4:07 PM Daniel Gustafsson <daniel(at)yesql(dot)se> wrote:

> Looking at this I like the idea in principle, but I'm not convinced that
> auto_explain is the right tool for this. auto_explain is for identifying slow
> queries, and what you are proposing is to identify queries with a certain
> "shape" (for lack of a better term) even if they aren't slow as per the
> log_min_duration setting. If log_min_duration is deemed too crude due to
> query volume then sample_rate is the tool. If sample_rate is also discarded,
> then pg_stat_statements seems a better option.
>
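
For reference, the alternatives mentioned above are the standard auto_explain
settings; combined they look roughly like this (values are only illustrative):

    -- Standard auto_explain setup: log plans for statements slower than
    -- log_min_duration, but only for a fraction of statements (sample_rate),
    -- to keep log volume under control.
    LOAD 'auto_explain';
    SET auto_explain.log_min_duration = '250ms';
    SET auto_explain.log_analyze = on;   -- include actual row counts in logged plans
    SET auto_explain.sample_rate = 0.1;  -- log only ~10% of qualifying statements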

I don't think pg_stat_statements can be used: a) it doesn't look at the
execution plan, so this feature would have a big overhead compared to the
current pg_stat_statements, which works only with the AST; b)
pg_stat_statements has one entry per AST, but here the problem is at the
execution-plan level, which is outside the scope of pg_stat_statements.
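
To illustrate point b), a sketch assuming pg_stat_statements on PostgreSQL 13
or later: the view is keyed by queryid and carries no plan-shape information,
so executions of the same statement with different plans collapse into one row:

    -- One row per normalized statement (queryid); nothing here says which
    -- plan (seq scan, index scan, ...) was used for any given execution.
    SELECT queryid, calls, rows, total_exec_time
    FROM pg_stat_statements
    ORDER BY total_exec_time DESC
    LIMIT 10;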

>
> Also, why just sequential scans (apart from it being this specific use case)?
> If the idea is to track aspects of execution which are deemed slow, then
> tracking for example spills etc. would be just as valid. Do you have
> thoughts on that?
>

Yes, I have thought about it more, and sometimes bitmap index scans are
problematic too; index scans in nested loops can be a problem as well.

For my last customer I had to detect queries with a large bitmap index
scan. I can do it with a combination of pg_stat_statements and log
checking, but this workflow is not very friendly.
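
The manual combination looks roughly like this (a sketch using the standard
pg_stat_statements columns; the actual plan shape still has to be confirmed
from the auto_explain output in the server log):

    -- Shortlist statements that touch many blocks, then grep the log for
    -- their plans (e.g. "Bitmap Heap Scan" / "Bitmap Index Scan" nodes)
    -- to confirm which of them actually do the large bitmap index scan.
    SELECT queryid, calls,
           shared_blks_hit + shared_blks_read AS total_blks
    FROM pg_stat_statements
    ORDER BY total_blks DESC
    LIMIT 20;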

My current idea is to have some extension that can be configured to watch
generally specified executor nodes.

Sometimes I need to know all the queries that do a seq scan over table x
where the number of processed tuples is greater than N. In other cases it
can be interesting to know which queries use index x for a bitmap index
scan.
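
With the proof-of-concept patch the first case would look roughly like this
(a sketch; auto_explain.log_seqscan is the GUC added by the patch, not an
existing setting, and per-table filtering is not part of it):

    -- Proof of concept only: log the plan of any query whose sequential
    -- scans produce more than the given number of tuples.
    LOAD 'auto_explain';
    SET auto_explain.log_seqscan = 100000;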

>
> That being said, a few comments on the patch:
>
> - (auto_explain_log_min_duration >= 0 && \
> + ((auto_explain_log_min_duration >= 0 || auto_explain_log_seqscan != -1) && \
> Is there a reason to not follow the existing code and check for >= 0?
>
> + DefineCustomIntVariable("auto_explain.log_seqscan",
> It's only a matter of time before another node is proposed for logging, and
> then we'll be stuck adding log_XXXnode GUCs. Is there a more future-proof
> way to do this?
>
> + "Sets the minimum tuples produced by sequantial scans which plans
> will be logged",
> s/sequantial/sequential/
>
> - es->analyze = (queryDesc->instrument_options && auto_explain_log_analyze);
> + es->analyze = (queryDesc->instrument_options && (auto_explain_log_analyze || auto_explain_log_seqscan != -1));
> Turning on ANALYZE when log_analyze isn't set to True is a no-no IMO.
>
> + * Colllect relations where log_seqscan limit was exceeded
> s/Colllect/Collect/
>
> + if (*relnames.data != '\0')
> + appendStringInfoString(&relnames, ",");
> This should use appendStringInfoChar instead.
>
> + (errmsg("duration: %.3f ms, over limit seqscans: %s, plan:\n%s",
> The "over limit" part is superfluous since it otherwise wouldn't be
> logged. If
> we're prefixing something wouldn't it be more helpful to include the limit,
> like: "seqscans >= %d tuples returned:". I'm not a fan of "seqscans" but
> spelling it out is also quite verbose and this is grep-able.
>
> Documentation and tests are also missing
>

Unfortunately, this idea is not well prepared. My patch was a proof of
concept, but I don't think it is a good starting point. Maybe it needs some
tracing API at the executor level and some tool like "perf top", but for the
executor. Post-execution analysis is not a good direction: it has a big
overhead, and above all it is not friendly in critical situations. I need
some handy tool like perf, but for executor nodes, and I don't know how to
do that effectively.

Thank you for your review and for your time, but I think it is better to
remove this patch from the commitfest. I have no idea how to design this
feature well :-/

Regards

Pavel

>
> --
> Daniel Gustafsson https://vmware.com/
>
>
