Re: New EXPLAIN ANALYZE statement

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-patches(at)postgresql(dot)org
Subject: Re: New EXPLAIN ANALYZE statement
Date: 2001-07-21 01:45:42
Message-ID: 20010721114542.A8518@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-patches

On Fri, Jul 20, 2001 at 01:39:34PM -0400, Tom Lane wrote:
> I'm still unconvinced that this approach will yield values that can
> meaningfully be compared to the planner's cost estimates. However,
> the thing I *really* object to about this patch is that the statistics-
> collection overhead is paid by every query whether it's being EXPLAINed
> or not. Kindly set it up so that that's not true. (One way to do that
> is to have EXPLAIN be responsible for traversing the tree and attaching
> instrumentation structs to the nodes; then at runtime, the work is done
> iff the struct links are not NULL.)

Why do you think the results cannot be compared to the estimates? The
estimates are after all an estimate of how long it would take. The code
already handles having NULLs in the structure so it doesn't do anything. I
was thinking if just having a flag in the top level query node and use that
to decide whether to allocate the structure in ExecInitNode.

> I think this is wrongheaded; one of the things that might be interesting
> is to know how much of the runtime is actually spent in the executor
> toplevel, doing the update/delete/trigger firing/whatever. And if your
> query is calling expensive functions, why would you not want to know
> that? The correct approach is to indeed do the query, full up, as-is.
> (And you should probably include the end-to-end time, user, system and
> wall-clock, in the printout.) What's needed is to design the user
> interface so that it does not surprise people that the query is
> executed. This suggests to me that it should not be called EXPLAIN,
> but something else. (No, not ANALYZE, that's taken.)

Unfortunatly (as far as I can tell) calls to functions don't have nodes in
the tree so I don't see anywhere to store the actual data. Maybe there is
some other non-shared structure I can use. I'll look.

What you are suggesting would make the explain output very much more
verbose. If that's what you're insterested in. I'm not sure adding all those
other time measurements would be particularly helpful. I agree with the
renaming but no-one seems to have any suggestions. Maybe TIME [query], like
at a command prompt.

> Random other comments:
>
> > + InstrStopNode( node->instrument );
> > +
> > + if( TupIsNull(result) )
> > + InstrEndLoop( node->instrument );
>
> I think this isn't gonna work right for Group nodes.

I noticed that Group node act strange, the tuplecount becomes the whole
number of tuples involved and the loop count is the number of groups. I was
going to look into it but am not convinced it is wrong as is.

> > + int tuplecount; /* Tuples so far this loop */
>
> Call me silly, but I think the tuple and iteration counts should be
> doubles, not ints, to avoid overflow issues.

I'd prefer a 64-bit int actually. Floating point is accurate but no
precise.

> > + bzero( i, sizeof(Instrumentation) );
>
> Don't use bzero; use the ANSI-standard routines (memset, etc).

OK.

> > +InstrStartNode( Instrumentation *i ) /* When we enter the node */
> > +{
> > + if( !i )
> > + return;
>
> This is just a personal thing, perhaps, but I *never* use a name like
> "i" for anything except loop counters. A struct argument ought to have
> a less anonymous name.

s/i/instr/ then

> > + while( i->counter.tv_usec < 0 )
> > + {
> > + i->counter.tv_usec += 1000000;
> > + i->counter.tv_sec--;
> > + }
>
> Don't you need to also reduce tv_usec to less than 1000000? Else you
> risk overflow of the usec field over successive executions.

I think the way the calculation is written, tv_usec can't be greater than
1000000.

> > + if( !timerisset( &i->counter ) ) /* Skip if nothing has happend */
>
> Are these "timerisset" and "timerclear" thingies portable? I can't find
> anything about them in the man pages. Since they're not used anywhere
> in Postgres at the moment, I'd just as soon not see you add a new
> platform dependency. Just write out the clearing/testing of tv_sec and
> tv_usec instead --- it's not like your code will work if those fields
> don't exist with those names...

I'm not sure. On my system (Debian Linux) they are described in the manpage
for gettimeofday. They are just one line #defines, not a major drama. I
can copy them in if you like.

> A general rule when dealing with system-defined stuff is to look
> around and see how similar stuff is coded elsewhere in the Postgres
> backend, and do it the same way. That way you can take advantage of
> literally years' worth of portability knowledge that we have
> accumulated, rather than repeating old mistakes that we have fixed.

Hmm, ok. gettimeofday is used elsewhere. There are no other system calls
involved.

--
Martijn van Oosterhout <kleptog(at)svana(dot)org>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.

In response to

Browse pgsql-patches by date

  From Date Subject
Next Message Martijn van Oosterhout 2001-07-21 01:58:34 Re: New EXPLAIN ANALYZE statement
Previous Message Bruce Momjian 2001-07-21 00:31:02 pg_hba.conf caching