Re: machine-readable pg_controldata?

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Greg Smith <greg(at)2ndquadrant(dot)com>
Cc: Magnus Hagander <magnus(at)hagander(dot)net>, Josh Berkus <josh(at)agliodbs(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: machine-readable pg_controldata?
Date: 2010-03-05 15:13:38
Message-ID: 4B911FA2.7080003@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Greg Smith wrote:
> pg_controldata itself just reads the file in directly and dumps the
> data. There is a copy of it kept around all the time in shared memory
> though (ControlFile in xlog.c), protected by a LWLock. At a high level
> you can imagine a new function in xlog.c that acquires that lock, copies
> the block into a space the backend allocated for saving it, releases the
> lock, and then returns the whole structure. Then just wrap some number
> of superuser-only UDFs around it (I'd guess nobody wants regular ones
> able to hold a lock on ControlFile) and you've exposed the results to
> user-space.
>
> Two questions before I'd volunteer to write that:
>
> 1) How do you handle the situation where the pg_controldata is invalid?
> "Not read in yet" and "CRC is bad" are the two most obvious ones that
> can happen.

If the data in pg_control are invalid, the database won't start up, so
by the time you're running the user-defined-functions the data really
ought be valid.

> 2) While it's easy to say "I only want one or two of these values" and
> expose a whole set of UDFs to grab them individually (perhaps wrapping
> into a system view via that popular approach), I am concerned that
> people are going to call any single-value versions provided one at a
> time and get an inconsistent set. I think the only reasonable interface
> to this would not return a single field, it would pop out all of them so
> you got a matching set from the point in time the lock was held. And if
> that's the case, I'm not sure of the most reasonable UI is. Just return
> a whole row with a column for each field in the file, and then people
> can select out just the ones they want? (That's probably the right
> one)

Yeah, one column for field seems ok to me.

Which fields do you want to expose? Perhaps it would make sense to split
the functionality in a few user-defined functions: one to return static
information about the architecture and compilation options (alignment,
32-bin vs 64 bit, block sizes, etc.) one to return all the fields
regarding latest checkpoint, plus other functions for the rest that are
needed.

The REDO location of last checkpoint might deserve a function of its
own, like we have pg_last_xlog_replay_location() and
pg_last_xlog_receive_location().

> Have re-raised these concerns to myself, this is usually the point in
> this exercise where I go "screw it, I'll just parse pg_controldata again
> instead" and do that instead. This is happening so much lately that I
> think Josh's suggestion it's just unworkable to keep going via that
> model forever has merit though. I find it hard to mark this 9.0
> territory though, given the data is not actually difficult to grab--and
> that trail is already well blazed, nothing new in this version.

I have no problem adding this to 9.0 if we have a solid proposal for the
UI. It's low risk and makes the life easier for people. People are
clearly missing this.

Then again, if you don't use the copy in shared memory but just open the
pg_control file and read it in the UDF, you could implement this as a
pgfoundry module that works with older versions too.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2010-03-05 15:14:56 Re: SQL compatibility reminder: MySQL vs PostgreSQL
Previous Message Dave Page 2010-03-05 14:55:50 Re: SQL compatibility reminder: MySQL vs PostgreSQL