Re: Timing events WIP v1

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: Peter Geoghegan <peter(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Timing events WIP v1
Date: 2013-01-14 20:37:40
Message-ID: 50F46C94.5030906@2ndQuadrant.com
Lists: pgsql-hackers

On 1/14/13 11:19 AM, Peter Geoghegan wrote:
> I noticed that when !log_checkpoints, control never reaches the site
> where the hook is called, and thus the checkpoint info is not stored.
> Is that the intended behaviour of the patch?

I was aware of that and considered it defensible: turning off
checkpoint logging takes that data out everywhere. It's not
necessarily the right thing to do, though.

> Currently, explain.c
> manages to generate JSON representations of plans with very little
> fuss, and without using any of the JSON datatype stuff. Doing this
> with hstore is just way too controversial, and not particularly worth
> it, IMHO.

I wasn't optimistic after seeing the number of bugs that scurried out
when the hstore rock was turned over for this job. On the
implementation side, the next round of code I've been playing with here
has struggled with the problem of rendering to strings earlier than I'd
like. I'd like to delay that as long as possible; certainly not do it
during storage, and preferably it only happens when someone asks for the
timing event.

> With a "where event_type = x" in the query predicate, the
> JSON datums would have predictable, consistent structure, facilitating
> machine reading and aggregation.

Filtering on a range of timestamps or on the serial number field is the
main thing that I imagined, as something that should limit results
before even producing tuples. I'd like the expected and important case,
where someone wants "all timing events after #123" after persisting #123
to disk, to be efficient. All of the fields I'd want to see filtering on
are part of the fixed set of columns every entry will have.
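
As a rough illustration of why the "all timing events after #123" case can be cheap, here is a minimal sketch in C, assuming events live in a fixed-size ring buffer and serial numbers are assigned contiguously starting at 1. None of these names (TimingEvent, TimingEventRing, te_append, te_fetch_after, TE_CAPACITY) come from the patch; they are hypothetical. With contiguous serials, the slot of the first wanted event can be computed directly, so nothing older is ever visited.

```c
#include <stddef.h>
#include <stdint.h>

#define TE_CAPACITY 8           /* hypothetical buffer size */

/* Fixed columns every entry carries (names are illustrative only). */
typedef struct TimingEvent
{
    uint64_t    serial;         /* contiguous, starts at 1 */
    int64_t     timestamp;      /* caller-supplied, e.g. microseconds */
    int         pid;
    int         event_type;
} TimingEvent;

typedef struct TimingEventRing
{
    TimingEvent slots[TE_CAPACITY];
    uint64_t    next_serial;    /* serial the next append will receive */
} TimingEventRing;

void
te_init(TimingEventRing *ring)
{
    ring->next_serial = 1;
}

/* Store one event, overwriting the oldest slot once the ring is full. */
uint64_t
te_append(TimingEventRing *ring, int64_t timestamp, int pid, int event_type)
{
    uint64_t     serial = ring->next_serial++;
    TimingEvent *ev = &ring->slots[serial % TE_CAPACITY];

    ev->serial = serial;
    ev->timestamp = timestamp;
    ev->pid = pid;
    ev->event_type = event_type;
    return serial;
}

/*
 * Copy events with serial > after into out, oldest first, and return how
 * many were copied.  Events already overwritten are silently skipped.
 */
size_t
te_fetch_after(const TimingEventRing *ring, uint64_t after,
               TimingEvent *out, size_t max_out)
{
    uint64_t    newest = ring->next_serial - 1;
    uint64_t    oldest = (ring->next_serial > TE_CAPACITY)
        ? ring->next_serial - TE_CAPACITY : 1;
    uint64_t    start;
    size_t      n = 0;

    if (after >= newest)
        return 0;               /* nothing newer than what the caller has */

    start = (after + 1 > oldest) ? after + 1 : oldest;
    for (uint64_t s = start; s <= newest && n < max_out; s++)
        out[n++] = ring->slots[s % TE_CAPACITY];
    return n;
}
```

A consumer that has persisted event #123 would call te_fetch_after(&ring, 123, out, max) and get only newer entries. Filtering on a timestamp range would need a search over the live slots instead, since timestamps are not contiguous.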

To summarize, your suggestion is to build an in-memory structure capable
of holding the timing event data. The Datum approach will be used to
cope with different types of events having different underlying types.
The output format for queries against this data set will be JSON,
rendered directly from the structure similarly to how EXPLAIN (FORMAT
JSON) outputs query trees. The columns every row contains, like a
serial number, timestamp, and pid, can be filtered on by something
operating at the query executor level. Doing something useful with the
more generic, "dynamic schema" parts will likely require parsing the
JSON output.
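
To make that summary concrete, here is a minimal sketch in C of the shape being described: a few fixed columns plus a list of typed, per-event values, kept in binary form and only turned into JSON text when somebody asks. It uses a plain tagged union as a stand-in for PostgreSQL's Datum, and every name in it (TeField, TeRecord, te_render_json, the "details" key) is hypothetical rather than taken from the patch. It also assumes keys and text values contain nothing that needs JSON escaping.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef enum { TE_INT, TE_FLOAT, TE_TEXT } TeValueType;

/* One "dynamic schema" value; the union stands in for a Datum. */
typedef struct TeField
{
    const char *key;
    TeValueType type;
    union
    {
        int64_t     i;
        double      f;
        const char *s;
    }           v;
} TeField;

/* Fixed columns every event has, plus its type-specific fields. */
typedef struct TeRecord
{
    uint64_t    serial;
    int         pid;
    const char *event_type;
    const TeField *fields;
    size_t      nfields;
} TeRecord;

/*
 * Render a record as JSON into buf.  Returns the number of bytes written
 * (not counting the terminating NUL), or -1 if buf is too small.
 * No escaping is done: keys and text values are assumed to be JSON-safe.
 */
int
te_render_json(const TeRecord *rec, char *buf, size_t len)
{
    size_t      off;
    int         n;

    n = snprintf(buf, len,
                 "{\"serial\": %llu, \"pid\": %d, \"event_type\": \"%s\", "
                 "\"details\": {",
                 (unsigned long long) rec->serial, rec->pid, rec->event_type);
    if (n < 0 || (size_t) n >= len)
        return -1;
    off = (size_t) n;

    for (size_t i = 0; i < rec->nfields; i++)
    {
        const TeField *f = &rec->fields[i];
        const char *sep = (i == 0) ? "" : ", ";

        switch (f->type)
        {
            case TE_INT:
                n = snprintf(buf + off, len - off, "%s\"%s\": %lld",
                             sep, f->key, (long long) f->v.i);
                break;
            case TE_FLOAT:
                n = snprintf(buf + off, len - off, "%s\"%s\": %.3f",
                             sep, f->key, f->v.f);
                break;
            default:            /* TE_TEXT */
                n = snprintf(buf + off, len - off, "%s\"%s\": \"%s\"",
                             sep, f->key, f->v.s);
                break;
        }
        if (n < 0 || (size_t) n >= len - off)
            return -1;
        off += (size_t) n;
    }

    n = snprintf(buf + off, len - off, "}}");
    if (n < 0 || (size_t) n >= len - off)
        return -1;
    return (int) (off + (size_t) n);
}
```

Storage only ever touches TeRecord; the string is built at read time, which is the "render as late as possible" property discussed earlier in this message. The fixed columns stay directly addressable for filtering, while anything under "details" has to be parsed back out of the JSON by the reader.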

Those are all great ideas I think people could live with.

It looks to me like the hook definition itself would need the entire
data structure defined, and known to work, before its API could be
nailed down. I was hoping that we might get a hook for diverting this
data committed into 9.3 even if the full extension to expose it wasn't
nailed down. That was based on similarity to the generic logging hook
that went into 9.2. This new implementation idea reminds me more of the
query length decoration needed for normalized pg_stat_statements
though--something that wasn't easy to just extract out from the consumer
at all.
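
For readers unfamiliar with the hook style under discussion, here is a minimal standalone sketch in C of the usual function-pointer convention, with the call site placed outside the logging check so that a consumer still receives events when checkpoint logging is off. This is one possible arrangement, not the patch's actual code: TimingEventData, timing_event_hook, report_checkpoint_end, and the two stand-in globals are all hypothetical, and the payload struct is a placeholder for exactly the part this message says is not yet settled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder payload; the real structure is the undecided part. */
typedef struct TimingEventData
{
    int         event_type;
    double      elapsed_secs;
} TimingEventData;

/* Hook type and pointer; NULL means no consumer is installed. */
typedef void (*timing_event_hook_type) (const TimingEventData *event);
timing_event_hook_type timing_event_hook = NULL;

/* Stand-ins for the log_checkpoints setting and the server log. */
bool        log_checkpoints_setting = false;
int         log_lines_written = 0;

/* Example consumer: just counts what it is handed. */
int         events_seen = 0;

void
count_events(const TimingEventData *event)
{
    (void) event;
    events_seen++;
}

/* Hypothetical call site at the end of a checkpoint. */
void
report_checkpoint_end(double elapsed_secs)
{
    TimingEventData ev = { 1, elapsed_secs };

    if (log_checkpoints_setting)
        log_lines_written++;    /* where the log message would be emitted */

    /* Not gated on the logging setting, unlike the behavior noted above. */
    if (timing_event_hook != NULL)
        timing_event_hook(&ev);
}
```

Even in this toy form, the hook's signature depends on TimingEventData, which shows why the API is hard to freeze before the data structure is known to work.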

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com
