Re: proposal - log_full_scan

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: proposal - log_full_scan
Date: 2021-04-17 15:22:59
Message-ID: CAFj8pRCMOowHzop2Pt1N=MNRnht++vYbpu1Zv+t4JGb4gu4w8Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

so 17. 4. 2021 v 17:09 odesílatel Justin Pryzby <pryzby(at)telsasoft(dot)com>
napsal:

> On Sat, Apr 17, 2021 at 04:36:52PM +0200, Pavel Stehule wrote:
> > today I worked on postgres's server used for critical service. Because
> the
> > application is very specific, we had to do final tuning on production
> > server. I fix lot of queries, but I am not able to detect fast queries
> that
> > does full scan of middle size tables - to 1M rows. Surely I wouldn't log
> > all queries. Now, there are these queries with freq 10 per sec.
> >
> > Can be nice to have a possibility to set a log of queries that do full
> > scan and read more tuples than is specified limit or that does full scan
> of
> > specified tables.
> >
> > What do you think about the proposed feature?
>
> Are you able to use auto_explain with auto_explain.log_min_duration ?
>

Unfortunately, I cannot use it. This server executes 5K queries per
seconds, and I am afraid to decrease log_min_duration.

The logs are forwarded to the network and last time, when users played with
it, then they had problems with the network.

I am in a situation where I know there are queries faster than 100ms, I see
so there should be fullscans from pg_stat_user_tables, but I don't see the
queries.

The fullscan of this table needs about 30ms and has 200K rows. So
decreasing log_min_duration to this value is very risky.

> Then you can search for query logs with
> message ~ 'Seq Scan .* \(actual time=[.0-9]* rows=[0-9]{6,} loops=[0-9]*)'
>
> Or can you use pg_stat_all_tables.seq_scan ?
>

I use pg_stat_all_tables.seq_scan and I see seq scans there. But I need to
know the related queries.

> But it seems to me that filtering on the duration would be both a more
> important criteria and a more general one, than "seq scan with number of
> rows".
>
> | (split_part(message, ' ', 2)::float/1000 AS duration ..) WHERE
> duration>2222;
>
> --
> Justin
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2021-04-17 16:31:11 Re: multi-install PostgresNode fails with older postgres versions
Previous Message Justin Pryzby 2021-04-17 15:09:31 Re: proposal - log_full_scan