Re: Explain XML patch v2

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Raney <raneyt(at)cecs(dot)pdx(dot)edu>
Cc: pgsql-patches(at)postgresql(dot)org, Gregory Stark <stark(at)enterprisedb(dot)com>
Subject: Re: Explain XML patch v2
Date: 2008-07-04 16:20:48
Message-ID: 1215188448.4051.225.camel@ebony.site
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


On Tue, 2008-07-01 at 21:48 -0700, Tom Raney wrote:
> This is an update to my EXPLAIN XML patch submitted a few days ago.

This is an important patch and you've done well to get it this far.

I have much detailed feedback, but these are just emergent requests, now
I can see and understand what you've done.

* If the patch was implemented as an ExplainOneQuery_hook we would be
able to use this with 8.3 also, which might be interesting. So there's
no real need for this to be a patch on core.

* If I understand, the DTD option inlines the DTD into the resulting XML
document, rather than adding a schema definition?

* The DTD should be in a separate file also, so it can be referred to.
We should give that DTD a permanent URL so we can identify it across the
internet. This is the first time the project has published an XML
DTD/Schema, so we need to think through the right way to publish it. The
DTD should have a version number in it, so when this gets updated in the
future we can validate against the correct one.

* The DTD is regrettably a long, long way from where I need it to be.
The PLAN elements are all unrelated to one another, apart from their
sequence within the XML document and their "indent". That definition can
lead to XML documents that match the DTD yet would never be valid plans,
amongst other problems. For sensible analysis of the SQL we need the DTD
to match the actual structure of nodes in the executor. This requires
major DTD changes, regrettably. The "indent" comes from the nesting of
the nodes, and is not merely a visual feature of the complete plan. IMHO
it is critically important that the XML correctly conveys the structure
of the plan and not just the formatting of the current text output.
Otherwise many doors will be closed to us and this whole project wasted.
I want to be able to write xml queries to retrieve plans that contain a
merge join where both the inner and outer are merge joins, or to easily
compare the structure of two complex queries looking for differences.

* The meaning of the two time attributes is somewhat confused. It is
startuptime and totaltime, not time_start and time_end.

* It looks to me like QUALIFIER alone is not enough, since in some cases
nodes have both an index condition and a filter, for example.

* I've made a number of suggested changes below, but the DTD really
needs to follow the structures in src/include/nodes/plannodes.h
In particular you'll see the recursive structure introduced by the DTD
changes below

* EXPLAIN has been renamed PLAN
* PLAN has been renamed NODE
* COST has been renamed to ESTIMATE
* ANALYZE has been renamed to ACTUAL
* OUTPUT, SORT and QUALIFIER have been removed for clarity only

<!ELEMENT plan (estimate, runtime?)>

<!ELEMENT node (table?, index?, estimate, actual?, outerpath?, innerpath?, initpath)>
<!ATTLIST plan\n"
nodetype CDATA #REQUIRED>

<!ELEMENT outerpath (node)>
<!ELEMENT innerpath (node)>
<!ELEMENT initpath (node)>

<!ELEMENT estimate EMPTY >

<!ATTLIST estimate
startupcost CDATA #REQUIRED
totalcost CDATA #REQUIRED
rows CDATA #REQUIRED
width CDATA #REQUIRED>

<!ELEMENT actual EMPTY >
<!ATTLIST actual\n"
startuptime CDATA #REQUIRED
totaltime CDATA #REQUIRED
rows CDATA #REQUIRED
loops CDATA #REQUIRED>

* I'd much prefer to define this as a Schema, since that's a bit more
flexible in what we can do, plus we can specify datatypes.

* Based upon some of those issues, I'd suggest that further work on the
DTD/Schema should only happen when a reasonable range of plans have been
accumulated to allow full understanding of the possibilities

I'm sorry I've found so much to say about this, but don't be dissuaded.
The basic structure of the patch is there, we just need to get the
details right also, so we can take full.

--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2008-07-04 16:22:08 Re: Review: DTrace probes (merged version)
Previous Message Tom Lane 2008-07-04 16:15:03 Re: pgsql: Fix a couple of bugs in win32 shmem name generation: * Don't cut

Browse pgsql-patches by date

  From Date Subject
Next Message Simon Riggs 2008-07-04 16:22:23 Re: Sorting writes during checkpoint
Previous Message Tom Lane 2008-07-04 16:05:54 Re: Sorting writes during checkpoint