Re: pg_dump and pgpool

From:	Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
To:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
Cc:	tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-general(at)postgresql(dot)org
Subject:	Re: pg_dump and pgpool
Date:	2004-12-30 17:22:19
Message-ID:	1104427339.5893.65.camel@state.g2switchworks.com
Lists:	pgsql-general

On Thu, 2004-12-30 at 09:46, Tatsuo Ishii wrote:
> > On Wed, 2004-12-29 at 17:30, Tom Lane wrote:
> > > Scott Marlowe <smarlowe(at)g2switchworks(dot)com> writes:
> > > > On Wed, 2004-12-29 at 16:56, Tom Lane wrote:
> > > >> No, we'd be throwing more, and more complex, queries. Instead of a
> > > >> simple lookup there would be some kind of join, or at least a lookup
> > > >> that uses a multicolumn key.
> > >
> > > > I'm willing to bet the performance difference is less than noise.
> > >
> > > [ shrug... ] I don't have a good handle on that, and neither do you.
> > > What I am quite sure about though is that pg_dump would become internally
> > > a great deal messier and harder to maintain if it couldn't use OIDs.
> > > Look at the DumpableObject manipulations and ask yourself what you're
> > > going to do instead if you have to use a primary key that is of a
> > > different kind (different numbers of columns and datatypes) for each
> > > system catalog. Ugh.
> >
> > Wait, do you mean it's impossible to throw a single SQL query with a
> > proper join clause that USES OIDs but doesn't return them? Or that it's
> > impossible to throw a single query without joining on OIDs? I don't
> > mind joining on OIDs; I just don't want them crossing the connection,
> > that's all. And yes, it might be ugly, but I can't imagine it being
> > unmaintainable for some reason.
> >
> > > I don't think it's worth that price to support a fundamentally bogus
> > > approach to backup.
> >
> > But it's not bogus. It allows me to compare two databases running under
> > a pgpool synchronous cluster and KNOW if there are inconsistencies in
> > data between them, so it is quite useful to me.
> >
> > > IMHO you don't want extra layers of software in
> > > between pg_dump and the database --- each one just introduces another
> > > risk of getting a wrong backup. You've yet to explain what the
> > > *benefit* of putting pgpool in there is for this problem.
> >
> > Actually, it ensures that I get the right backup, because pgpool will
> > cause the backup to fail if there are any differences between the two
> > backend servers, thus telling me that I have an inconsistency.
> >
> > That's the primary reason I want this. The secondary reason, which I
> > can work around, is that I'm running the individual databases on
> > machines that only answer the pgpool machine's specific IP, so
> > remote backups aren't possible, and only the pgpool machine would be
> > capable of doing the backups, but we have (like so many other companies)
> > a centralized backup server. I can always allow that machine to connect
> > to the database(s) to do backup, but my fear is that by allowing
> > anything other than pgpool to hit those backend databases they could be
> > placed out of sync with each other. Admittedly, a backup process
> > shouldn't be updating the database, so this, as I said, isn't really a
> > big deal. More of a mild kink really. As long as all access is
> > happening through pgpool, they should stay coherent to each other.
>
> Pgpool could be modified so that it has a "no SELECT replication mode",
> where pgpool runs SELECT only on the master server. I could do this if
> you think it's useful.
>
> However, the problem is that pg_dump is not only running SELECT but also
> modifying the database (counting up the OID counter), i.e. it creates
> temporary tables. Is this a problem for you?

Does it? I didn't know it used temp tables. It's not that big of a
deal, and I'm certain I can work around it. I just really like the idea
of a cluster of pg servers running synchronously behind a redirector and
looking, for all the world, like one database. But I think it would
take log shipping for it to work the way I'm envisioning. I'd much
rather see work go into making pgpool run atop >2 servers than this
exercise in (_very_) likely futility.
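
For anyone following the OID point above, here is a rough sketch of the
idea, not pgpool's actual code. It assumes a replicator that runs each
query on every backend and fails when the results differ, and it uses
in-memory sqlite3 databases as stand-ins for the two PostgreSQL servers.
The names make_backend, replicated_query, schemas and tables are all made
up for illustration, and the integer ids merely play the role of OIDs that
differ between the two servers.

```python
import sqlite3


def make_backend(id_offset):
    """Build an in-memory stand-in for one backend server.

    id_offset shifts the internal ids, mimicking OIDs that differ
    between two servers holding otherwise identical data.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE schemas (id INTEGER PRIMARY KEY, name TEXT)")
    db.execute(
        "CREATE TABLE tables ("
        "id INTEGER PRIMARY KEY, schema_id INTEGER, name TEXT)"
    )
    db.execute("INSERT INTO schemas VALUES (?, 'public')", (id_offset + 1,))
    db.execute(
        "INSERT INTO tables VALUES (?, ?, 'orders')",
        (id_offset + 10, id_offset + 1),
    )
    return db


def replicated_query(backends, sql):
    """Run sql on every backend and fail if any result set differs."""
    results = [db.execute(sql).fetchall() for db in backends]
    if any(r != results[0] for r in results[1:]):
        raise RuntimeError("backends disagree: %r" % (results,))
    return results[0]


# Same logical data on both backends, different internal ids.
backends = [make_backend(0), make_backend(1000)]

# Joins ON the ids but never returns them.
JOIN_ONLY = (
    "SELECT s.name, t.name FROM tables t "
    "JOIN schemas s ON s.id = t.schema_id "
    "ORDER BY s.name, t.name"
)

# Returns the ids themselves, so they cross the connection.
RETURNS_IDS = "SELECT id, name FROM tables ORDER BY name"

print(replicated_query(backends, JOIN_ONLY))

try:
    replicated_query(backends, RETURNS_IDS)
except RuntimeError as exc:
    print("failed as expected:", exc)
```

In this toy setup the first query passes because the ids are used only
inside the join, while the second fails because the ids are part of the
result set. Whether pg_dump's catalog queries could be restructured along
those lines is exactly the open question in this thread.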

In response to

Re: pg_dump and pgpool at 2004-12-30 15:46:29 from Tatsuo Ishii
