On Sat, Jan 21, 2012 at 18:02, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
> On 20 January 2012, 13:23, Magnus Hagander wrote:
>> On Tue, Jan 17, 2012 at 21:39, Tomas Vondra <tv(at)fuzzy(dot)cz> wrote:
>>> On 20.12.2011 19:59, Tomas Vondra wrote:
>>>> On 20.12.2011 11:20, Magnus Hagander wrote:
>>>>> 2011/12/20 Tomas Vondra <tv(at)fuzzy(dot)cz>:
>>>>>> I haven't updated the docs yet - let's see if the patch is acceptable
>>>>>> at all first.
>>>>> Again, without having reviewed the code, this looks like a feature
>>>>> we'd want, so please add some docs, and then submit it for the next
>>>> I've added the docs (see the attachment) and rebased to current head.
>>> Fixed a failing regression test (check of pg_stat_database structure).
>> I'm wondering if this (and also my deadlocks stats patch that's in the
>> queue) should instead of inventing new pgstats messages, add fields to
>> the tabstat message. It sounds like that one is just for tables, but
>> it's already the one collecting info about commits and rollbacks, and
>> it's already sent on every commit.
> Hmmm, I'm not against that, but I'd recommend changing the message name to
> something that reflects the reality. If it's not just about table
> statistics, it should not be named 'tabstats' IMHO. Or maybe split that
> into two messages, both sent at the commit time.
Yes, renaming it might be a good idea. If you split it into two
messages that would defeat much of the point.
Though I just looked at the tabstat code again, and we already split
that message up at regular intervals. Which would make it quite weird
to have global counters in it as well.
But instead of putting them there, perhaps we need a general "non-table,
but more than one type of data" message sent out at the same time. There
is other stuff in the queue that could use it.
I'm not convinced either way - I'm not against the original way in
your patch either. I just wanted to throw the idea out there, and was
hoping somebody else would have an opinion too :-)
> I do like the idea of not sending the message for each temp file, though.
> One thing that worries me is long-running transactions (think about a
> batch process that runs for several hours within a single transaction). By
> sending the data only at the commit, such transactions would not be
> accounted properly. So I'd suggest sending the message either at commit
By that argument they are *already* not accounted properly, because
the "number of rows" and those counters are wrong. By sending the temp
file data in the middle of the transaction, you won't be able to
correlate those numbers with the temp file usage.
I'm not saying the other use case isn't more common, but whichever way
you do it, it's going to get inconsistent with *something*.
> time or after collecting enough data (increment a counter whenever the
> struct is updated and send a message when the counter >= N for a
> reasonable value of N, say 20). But maybe it already works that way - I
> haven't checked the current 'tabstat' implementation.
No, tabstat is sent at transaction end only.
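For reference, the "send at commit or after N updates" scheme suggested in the quoted text could look roughly like the sketch below. All names here (TempFileStats, tempfile_stats_record, TEMPFILE_FLUSH_THRESHOLD and so on) are hypothetical and are not taken from the patch or from pgstat.c; the "send" step is reduced to bumping a counter so the logic can be read in isolation.

```c
#include <stdint.h>

/* Hypothetical threshold: the "N, say 20" from the suggestion above. */
#define TEMPFILE_FLUSH_THRESHOLD 20

/* Hypothetical per-backend accumulator; not an actual pgstat structure. */
typedef struct TempFileStats
{
    uint64_t files;     /* temp files seen since the last flush */
    uint64_t bytes;     /* total size of those files */
    int      updates;   /* updates accumulated since the last flush */
    int      sent;      /* messages "sent" so far (illustration only) */
} TempFileStats;

/* Send the pending counters, if any, and reset the accumulator. */
static void
tempfile_stats_flush(TempFileStats *s)
{
    if (s->updates == 0)
        return;                 /* nothing pending, skip the message */

    /* A real implementation would hand the counters to the collector here. */
    s->sent++;

    s->files = 0;
    s->bytes = 0;
    s->updates = 0;
}

/* Record one temp file; flush early once enough updates have piled up. */
static void
tempfile_stats_record(TempFileStats *s, uint64_t size)
{
    s->files++;
    s->bytes += size;

    if (++s->updates >= TEMPFILE_FLUSH_THRESHOLD)
        tempfile_stats_flush(s);
}

/* At transaction end, send whatever is left. */
static void
tempfile_stats_at_commit(TempFileStats *s)
{
    tempfile_stats_flush(s);
}
```

With this shape a long-running transaction reports its temp files in batches of 20 rather than only at commit, at the cost of the mid-transaction inconsistency with the row counters mentioned above.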
>> Adding two fields to that one would add some extra bytes on every
>> send, but I wonder if that would ever affect performance, given the
>> total size of the packet? And it would certainly be lower overhead in
>> the cases that there *have* been temp tables used.
> It's not about temp tables, it's about temp files. Which IMHO implies that
> there would be exactly 0.000001% performance difference because temporary
> files are quite expensive.
Bad choice of words, I meant temp files, and was thinking temp files
the whole way :-)
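To put a number on the "extra bytes on every send", here is a small sketch comparing a commit-time message header with and without two temp file counters. The struct layouts and names (CommitStatsHdr, CommitStatsHdrWithTemp, temp_files, temp_bytes) are assumptions made up for illustration; they are not the real tabstat message definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header of a commit-time stats message (not the real layout). */
typedef struct CommitStatsHdr
{
    int32_t database_id;
    int32_t n_entries;      /* number of per-table entries that follow */
    int32_t xact_commit;
    int32_t xact_rollback;
} CommitStatsHdr;

/* The same header with the two proposed temp file counters added. */
typedef struct CommitStatsHdrWithTemp
{
    int32_t  database_id;
    int32_t  n_entries;
    int32_t  xact_commit;
    int32_t  xact_rollback;
    uint64_t temp_files;    /* assumed counter: temp files created */
    uint64_t temp_bytes;    /* assumed counter: bytes written to them */
} CommitStatsHdrWithTemp;

/* Extra bytes carried by every message under these assumed layouts. */
static size_t
temp_stats_overhead(void)
{
    return sizeof(CommitStatsHdrWithTemp) - sizeof(CommitStatsHdr);
}
```

Under these assumptions the overhead is two 8-byte counters per message, which is small next to the per-table entries the message already carries.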