Skip site navigation (1) Skip section navigation (2)

Re: Access statistics

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Jan Wieck <JanWieck(at)Yahoo(dot)com>, PostgreSQL HACKERS <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Access statistics
Date: 2001-06-01 13:31:05
Message-ID: 200106011331.f51DV5P02078@jupiter.us.greatbridge.com (view raw or flat)
Thread:
Lists: pgsql-hackers
Peter Eisentraut wrote:
> Jan Wieck writes:
>
> >     In  the  next  couple of hours (at least tomorrow) I would be
> >     ready to commit the backend changes  for  table-/index-access
> >     statistics and current backend activity views.
> >
> >     Should  I  apply  the patches or provide a separate patch for
> >     review first?
>
> Maybe you could describe what it's going to do and how it's going to work.

    Real  programmers  don't comment - if it was hard to write it
    should be hard to read :-)

    So outing myself not beeing a *real programmer*, this is what
    I have so far:

    * On startup the postmaster creates an INET domain UDP socket
      and bind(2)'s it to localhost:0, meaning  the  kernel  will
      assign  a  yet unused, unprivileged port that could be seen
      with getsockaddr(2).

      It then starts two background processes  of  which  one  is
      simply a wraparound buffer doing recvfrom(2) on the socket,
      checking that the source address of the received packets is
      the  sockets  own  address (!) and forwarding approved ones
      over a pipe to the second one, discribed later.

    * Backends call some collector functions  at  various  places
      now  (these  will  finally  be macros), that count up table
      scans, tuples returned by scans,  buffer  fetches/hits  and
      the like. At the beginning of a statement the backends send
      a message telling the first couple of hundred bytes of  the
      querystring  and  after  the statement is done (just before
      getting ready for the next command) they send the collected
      access numbers.

      Tables,  indexes etc. in these statistics are identified by
      OID, so the data doesn't tell much so far.

    * The second  background  process  fired  by  the  postmaster
      collects  these  numbers  into hashtables and as long as it
      receives messages, it'll write out a summary file every 500
      or  so  milliseconds,  telling a snapshot of current stats.
      Using select(2) with timeouts ensures that a complete  idle
      DB  instance not to waste a single CPU cycle or IO to write
      these snapshots.

      On startup it tries to read the last snapshot file  in,  so
      the collected statistics survive a postmaster restart.

      Vacuum  reads  the  file too and sends bulk delete messages
      for  objects  that  are  gone.  So  the  stats  don't  grow
      infinitely.

    * A  bunch  of  new  builtin  functions  gain  access  to the
      snapshot file. At first call  of  one  of  these  functions
      during  a  transaction,  the  backend will read the current
      file and return the numbers from in memory then.

      Based on these functions a couple of views can  tell  these
      collected  stats.  Information  from  the  databases system
      catalog is of course required to identify  the  objects  in
      the  stats,  but  I think those informations should only be
      visible to someone who identified herself  as  a  valid  DB
      user anyway.

      The  visibility  of  querystrings  (this  info is available
      cross DB) is restricted to DB superusers.

    There has been discussion already about  using  an  INET  vs.
    UNIX  UDP  socket for the communication. At least for Linux I
    found INET to be the most effective way of communication. And
    for  security  concerns: If someone else than root can really
    send packets to that  socket  that  show  up  with  a  source
    address  of  127.0.0.1:n,  where  n  is a portnumber actually
    occupied by your own socket, be sure you'll have more  severe
    problems than modified access statistics.

    The views should be considered examples. The final naming and
    layout is subject for discussion.


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck(at)Yahoo(dot)com #



_________________________________________________________
Do You Yahoo!?
Get your free @yahoo.com address at http://mail.yahoo.com


In response to

Responses

pgsql-hackers by date

Next:From: Tom LaneDate: 2001-06-01 13:49:25
Subject: Re: ERROR: cache lookup for proc 43030134 failed
Previous:From: Michael SamuelDate: 2001-06-01 13:11:13
Subject: Re: Re: Support for %TYPE in CREATE FUNCTION

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group