Re: Less than ideal error reporting in pg_stat_statements

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Less than ideal error reporting in pg_stat_statements
Date: 2015-10-04 22:10:23
Message-ID: 30253.1443996623@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Peter Geoghegan <pg(at)heroku(dot)com> writes:
> On Sun, Oct 4, 2015 at 1:01 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Hm. The problem I've got with this is that then mean_query_len means
>> something significantly different after entry_dealloc than it does
>> after gc_texts.
>>
>> I'd be okay with changing *both* of those functions to ignore sticky
>> entries in the calculation, if that seems reasonable to you.

> That seems perfectly reasonable, yes. Should I leave that to you?

After a closer look I decided that wasn't reasonable at all. Discounting
sticky texts would then mean that after a GC cycle, we might still think
the query texts file is bloated and issue another GC request, which of
course would not shrink the file, so that we'd be doing GC cycles every
time we added a new query. The mean query len recorded by gc_qtexts()
*has to* match the mean length of what it actually put in the file, not
the mean length of what we might wish it would put in the file.

By the same token, I'm back to believing that it's fundamentally bogus for
entry_dealloc() to compute mean_query_len that way. The most likely
result of that is useless GC cycles. The only thing that will actually
free memory when you've got a lot of dead sticky entries is getting rid of
the sticky hashtable entries, and the only way to do that is to wait for
entry_dealloc() to get rid of 'em. You were unenthused about making that
removal more aggressive, which is fine, but you can't have it both ways.

It does strike me that when we do get rid of the sticky entries, cleanup
of the texts file might lag a bit longer than it needs to because
mean_query_len is computed before not after deleting whatever entries
we're going to delete. On average, that shouldn't matter ... but if we
are tossing a bunch of dead sticky entries, maybe they would have a higher
mean length than the rest? Not sure about it. I put a comment about
this into entry_dealloc() for the moment, but we can revisit whether it
is worth adding code/cycles to get a more up-to-date mean length.

Anyway, I've committed the aspects of this that we're agreed on.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2015-10-04 22:16:34 Re: Less than ideal error reporting in pg_stat_statements
Previous Message Marko Tiikkaja 2015-10-04 22:08:49 ALTER TABLE behind-the-scenes effects' CONTEXT