Re: Less than ideal error reporting in pg_stat_statements

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Marti Raudsepp <marti(at)juffo(dot)org>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Less than ideal error reporting in pg_stat_statements
Date: 2015-09-29 20:56:37
Message-ID: CAM3SWZRxEEv8ZuSqDBDNQn3m2Ac=ABS6=nvG09Y56LLKRVN9fA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Sep 25, 2015 at 8:51 AM, Marti Raudsepp <marti(at)juffo(dot)org> wrote:
> I've also been seeing lots of log messages saying "LOG: out of
> memory" on a server that's hosting development databases. I put off
> debugging this until now because it didn't seem to have any adverse
> effects on the system.

That's unfortunate.

> It's very frustrating when debugging aids cause further problems on a
> system. If the in-line compaction doesn't materialize (or it's decided
> not to backport it), I would propose instead to add a test to
> pgss_store() to avoid growing the file beyond MaxAlloc (or perhaps
> even a lower limit). Surely dropping some statistics is better than
> this pathology.

I heard a customer complaint today that seems similar. A Heroku
customer attempted a migration from MySQL to PostgreSQL in this
manner:

mysqldump | psql

This worked at least well enough to cause problems for
pg_stat_statements (some of the dumped queries were not rejected by
the parser, I suppose).

While I'm opposed to arbitrary limits for tools like
pg_stat_statements, I think the following defensive measure might be
useful on top of better OOM defenses:

Test for query text length within pgss_store() when a pgssJumbleState
is passed by the caller (i.e. the post-parse-analysis hook -- not the
executor hook, which has query costs to store). If the text is longer
than, say, 10 * Max(ASSUMED_MEDIAN_INIT, pgss->cur_median_usage), do
not bother to normalize the query text or store anything at all;
simply return.
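
Something along these lines, early in pgss_store() (just a sketch:
ASSUMED_MEDIAN_INIT, cur_median_usage, jstate and query_len are
existing names in pg_stat_statements.c, but the exact threshold is
only the heuristic above):

    /*
     * Sketch: bail out before normalizing or storing an oversized
     * query text.  jstate != NULL means we were called from the
     * post-parse-analysis hook, so there are no execution costs to
     * lose by returning early.
     */
    if (jstate != NULL &&
        query_len > 10 * Max(ASSUMED_MEDIAN_INIT, pgss->cur_median_usage))
        return;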

Any entry we create at that point will be a sticky entry (pending
actually completing execution without the entry being evicted), and it
doesn't seem worth worrying about normalizing very large query texts,
which tend to be qualitatively similar to utility statements from the
user's perspective anyway. Besides, query text normalization has
always occurred on a best-effort basis; it's not uncommon for
pg_stat_statements to fail to normalize query texts today for obscure
reasons.

This would avoid storing very large query texts again and again when a
very large DML statement repeatedly fails (e.g. due to a data
integration task that runs into duplicate violations) and is rerun
with tweaks. Each run may even involve a distinct table, because a
CREATE TABLE is performed as part of the same transaction; each failed
query then gets a new pg_stat_statements entry, and a new, large query
text.

I should probably also assume that sticky entries have a generic
length (the existing pgss->mean_query_len) for the purposes of
accumulating query text length within entry_dealloc(). They should not
get to contribute to the mean query length (which throttles garbage
collection of the query text file to prevent thrashing).
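
Roughly, inside entry_dealloc()'s scan over the hash table (again
just a sketch; counters.calls == 0 is how a sticky entry is recognized
today, while the totlen/nvalid accounting shown is only illustrative):

    /*
     * Sketch: when recomputing mean_query_len, credit sticky entries
     * with the generic mean length rather than their actual (possibly
     * huge) text length, so a giant failed query cannot inflate the
     * threshold that throttles query text garbage collection.
     */
    if (entry->counters.calls == 0)     /* sticky entry */
        totlen += pgss->mean_query_len + 1;
    else
        totlen += entry->query_len + 1;
    nvalid++;

    /* then, once the scan completes: */
    if (nvalid > 0)
        pgss->mean_query_len = totlen / nvalid;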

Anyone have an opinion on that?

--
Peter Geoghegan
