
Re: pg_dump and pgpool

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: smarlowe(at)g2switchworks(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump and pgpool
Date: 2004-12-30 15:46:29
Lists: pgsql-general
> On Wed, 2004-12-29 at 17:30, Tom Lane wrote:
> > Scott Marlowe <smarlowe(at)g2switchworks(dot)com> writes:
> > > On Wed, 2004-12-29 at 16:56, Tom Lane wrote:
> > >> No, we'd be throwing more, and more complex, queries.  Instead of a
> > >> simple lookup there would be some kind of join, or at least a lookup
> > >> that uses a multicolumn key.
> > 
> > > I'm willing to bet the performance difference is less than noise.
> > 
> > [ shrug... ]  I don't have a good handle on that, and neither do you.
> > What I am quite sure about though is that pg_dump would become internally
> > a great deal messier and harder to maintain if it couldn't use OIDs.
> > Look at the DumpableObject manipulations and ask yourself what you're
> > going to do instead if you have to use a primary key that is of a
> > different kind (different numbers of columns and datatypes) for each
> > system catalog.  Ugh.
> Wait, do you mean it's impossible to throw a single SQL query with a
> proper join clause that USES OIDs but doesn't return them?  Or that it's
> impossible to throw a single query without joining on OIDs?  I don't
> mind joining on OIDs, I just don't want them crossing the connection, is
> all.  And yes, it might be ugly, but I can't imagine it being
> unmaintainable for some reason.
> > I don't think it's worth that price to support a fundamentally bogus
> > approach to backup.
> But it's not bogus.  It allows me to compare two databases running under
> a pgpool synchronous cluster and KNOW if there are inconsistencies in
> data between them, so it is quite useful to me.
> > IMHO you don't want extra layers of software in
> > between pg_dump and the database --- each one just introduces another
> > risk of getting a wrong backup.  You've yet to explain what the
> > *benefit* of putting pgpool in there is for this problem.
> Actually, it ensures that I get the right backup, because pgpool will
> cause the backup to fail if there are any differences between the two
> backend servers, thus telling me that I have an inconsistency.
> That's the primary reason I want this.  The secondary reason, which I
> can work around, is that I'm running the individual databases on
> machines that only answer to the pgpool machine's IP, so
> remote backups aren't possible, and only the pgpool machine would be
> capable of doing the backups, but we have (like so many other companies)
> a centralized backup server.  I can always allow that machine to connect
> to the database(s) to do backup, but my fear is that by allowing
> anything other than pgpool to hit those backend databases they could be
> placed out of sync with each other.  Admittedly, a backup process
> shouldn't be updating the database, so this, as I said, isn't really a
> big deal.  More of a mild kink really.  As long as all access is
> happening through pgpool, they should stay coherent to each other.

Pgpool could be modified to have a "no SELECT replication mode", in
which pgpool runs SELECT only on the master server. I could do this if
you think it's useful.
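
To make the idea concrete, here is a minimal sketch in Python of the routing
decision such a mode would imply. It is not pgpool's actual code: the function
name route_statement, the select_replication flag, and the backend labels are
all hypothetical, and classifying a statement by its leading keyword is a
deliberate simplification.

```python
import re
from typing import List

# Hypothetical labels for the two backends behind the pool.
MASTER = "master"
SECONDARY = "secondary"

# Naive classifier: treats a statement as a SELECT if it starts with that keyword.
_SELECT_RE = re.compile(r"\s*select\b", re.IGNORECASE)


def route_statement(sql: str, select_replication: bool = True) -> List[str]:
    """Return the backends a statement would be sent to.

    With select_replication=False (the proposed "no SELECT replication
    mode"), SELECT statements go to the master only; every other
    statement is still sent to both backends.
    """
    if not select_replication and _SELECT_RE.match(sql):
        return [MASTER]
    return [MASTER, SECONDARY]
```

With select_replication=False, route_statement("SELECT * FROM pg_class")
yields only the master, while a statement such as
"CREATE TEMP TABLE t (i int)" is still sent to both backends.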

However, the problem is that pg_dump does not only run SELECT; it also
modifies the database (it advances the OID counter), i.e. it creates
temporary tables. Is this a problem for you?

Tatsuo Ishii
