From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Hot Standby, release candidate? |
Date: | 2009-12-13 22:22:06 |
Message-ID: | 1260742926.1955.195.camel@ebony |
Lists: | pgsql-hackers |
On Sun, 2009-12-13 at 15:45 -0500, Tom Lane wrote:
> Simon Riggs <simon(at)2ndQuadrant(dot)com> writes:
> > * NonTransactionalInvalidation logging has been removed following
> > review, but AFAICS that means VACUUM FULL doesn't work correctly on
> > catalog tables, which regrettably will be the only ones still standing
> > even after we apply VFI patch. Did I misunderstand the original intent?
> > Was it just buggy somehow? Or is this hoping VF goes completely, which
> > seems unlikely in this release.
>
> For my money, the only reason VF is still around is there hasn't been
> an urgent reason to get rid of it. If it doesn't play with HS, I think
> we'd be better served to put work into getting rid of it than to put
> work into fixing it.
I see the logic, though it has many implications. I'll step up, if I can
get some help from you and Itagaki on the VF side.
You have a rough design here:
http://archives.postgresql.org/message-id/19750.1252094460@sss.pgh.pa.us
Some thoughts and some further work on a detailed design:
* Which exact tables are we talking about: just pg_class and the shared
catalogs? Everything else is in pg_class, so if we can find it we're OK?
formrdesc() tells me the list of nailed relations is: pg_database,
pg_class, pg_attribute, pg_proc, and pg_type. Are the nailed relations
the ones we care about, or are they just a subset?
* Restrict set of operations to *only* VACUUM FULL. Is there a need for
anything else to do this, at least in this release?
* Each backend needs to access two map files: shared and local
* Get relcache to read map files at startup in formrdesc(). Rather than
use RelationInitPhysicalAddr(), set relation->rd_node.relNode directly.
* Get VF to write a new type of invalidation message that means re-read
the two map files to overwrite the relation->rd_node.relNode in the
nailed relations
* Map files would have a very structured format, so each table listed
has its exact place. The best place for the shared catalogs sounds like
pg_control. We only need a few additional bytes for that, and everything
else needed to manipulate it already exists.
* Map files for specific databases would be called pg_database_control,
with roughly same concepts as pg_control. It's then an obvious place to
add any further db specific things in future, if we need them.
* Protect all reading and writing of map files using ControlFileLock. The
update sequence is: acquire lock, send invalidation, rewrite file,
release lock, all inside a critical section. Readers would take the lock
in shared mode, writers in exclusive mode.
* Work would be in two tranches: add new way of working then later
remove code we don't need; I would actually rather do the second part at
start of next dev cycle.
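
To make the proposal above a little more concrete, here is a minimal
standalone sketch in C of a fixed-slot map file and the update sequence
(acquire lock, send invalidation, rewrite file, release lock). It is not
PostgreSQL code. The names MapFile, MapSlot, MAP_MAGIC, map_read() and
map_update() are hypothetical, the lock and invalidation functions are
trivial stand-ins for ControlFileLock and the real invalidation machinery,
and the critical section is not modelled at all.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical layout: one fixed slot per nailed catalog. */
enum MapSlot {
    SLOT_PG_DATABASE, SLOT_PG_CLASS, SLOT_PG_ATTRIBUTE,
    SLOT_PG_PROC, SLOT_PG_TYPE, MAP_NSLOTS
};

#define MAP_MAGIC 0x50474D50u   /* arbitrary marker, an assumption */

typedef struct MapFile {
    uint32_t magic;
    uint32_t relfilenode[MAP_NSLOTS];   /* slot index -> physical file */
} MapFile;

/* Stand-ins for ControlFileLock and for sending an invalidation. */
static int lock_mode = 0;               /* 0 free, 1 shared, 2 exclusive */
static unsigned invalidations_sent = 0;
static void lock_acquire(int mode) { lock_mode = mode; }
static void lock_release(void) { lock_mode = 0; }
static void send_map_invalidation(void) { invalidations_sent++; }

/* Load the file without locking; the caller must hold the lock. */
static int load_map(const char *path, MapFile *map)
{
    FILE *fp = fopen(path, "rb");
    int ok;

    if (fp == NULL)
        return -1;
    ok = fread(map, sizeof(*map), 1, fp) == 1 && map->magic == MAP_MAGIC;
    fclose(fp);
    return ok ? 0 : -1;
}

/* Reader path: shared lock, then read the whole fixed-size file. */
int map_read(const char *path, MapFile *map)
{
    int rc;

    lock_acquire(1);
    rc = load_map(path, map);
    lock_release();
    return rc;
}

/* Writer path: acquire lock, send invalidation, rewrite file, release. */
int map_update(const char *path, enum MapSlot slot, uint32_t new_node)
{
    MapFile map;
    FILE *fp;
    int rc = -1;

    assert(slot >= 0 && slot < MAP_NSLOTS);
    lock_acquire(2);
    if (load_map(path, &map) != 0) {
        /* No valid file yet: start from an empty map. */
        memset(&map, 0, sizeof(map));
        map.magic = MAP_MAGIC;
    }
    map.relfilenode[slot] = new_node;
    send_map_invalidation();
    fp = fopen(path, "wb");
    if (fp != NULL) {
        if (fwrite(&map, sizeof(map), 1, fp) == 1)
            rc = 0;
        if (fclose(fp) != 0)
            rc = -1;
    }
    lock_release();
    return rc;
}
```

Because every catalog has a fixed slot, a backend that receives the
invalidation only has to re-read one small file and copy the slot values
into the nailed relcache entries. A real implementation would also need
to make the rewrite crash-safe, which this sketch does not attempt.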
--
Simon Riggs www.2ndQuadrant.com