Re: pg_dump and pgpool

From: Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To: smarlowe(at)g2switchworks(dot)com
Cc: tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-general(at)postgresql(dot)org
Subject: Re: pg_dump and pgpool
Date: 2004-12-30 15:46:29
Message-ID: 20041231.004629.57439590.t-ishii@sra.co.jp
Lists: pgsql-general

> On Wed, 2004-12-29 at 17:30, Tom Lane wrote:
> > Scott Marlowe <smarlowe(at)g2switchworks(dot)com> writes:
> > > On Wed, 2004-12-29 at 16:56, Tom Lane wrote:
> > >> No, we'd be throwing more, and more complex, queries. Instead of a
> > >> simple lookup there would be some kind of join, or at least a lookup
> > >> that uses a multicolumn key.
> >
> > > I'm willing to bet the performance difference is less than noise.
> >
> > [ shrug... ] I don't have a good handle on that, and neither do you.
> > What I am quite sure about though is that pg_dump would become internally
> > a great deal messier and harder to maintain if it couldn't use OIDs.
> > Look at the DumpableObject manipulations and ask yourself what you're
> > going to do instead if you have to use a primary key that is of a
> > different kind (different numbers of columns and datatypes) for each
> > system catalog. Ugh.
>
> Wait, do you mean it's impossible to throw a single SQL query with a
> proper join clause that USES OIDs but doesn't return them? Or that it's
> impossible to throw a single query without joining on OIDs? I don't
> mind joining on OIDs, I just don't want them crossing the connection,
> is all. And yes, it might be ugly, but I can't imagine it being
> unmaintainable for some reason.
>
> > I don't think it's worth that price to support a fundamentally bogus
> > approach to backup.
>
> But it's not bogus. It allows me to compare two databases running under
> a pgpool synchronous cluster and KNOW if there are inconsistencies in
> data between them, so it is quite useful to me.
>
> > IMHO you don't want extra layers of software in
> > between pg_dump and the database --- each one just introduces another
> > risk of getting a wrong backup. You've yet to explain what the
> > *benefit* of putting pgpool in there is for this problem.
>
> Actually, it ensures that I get the right backup, because pgpool will
> cause the backup to fail if there are any differences between the two
> backend servers, thus telling me that I have an inconsistency.
>
> That's the primary reason I want this. The secondary reason, which I
> can work around, is that I'm running the individual databases on
> machines that only answer to the pgpool machine's IP, so
> remote backups aren't possible, and only the pgpool machine would be
> capable of doing the backups, but we have (like so many other companies)
> a centralized backup server. I can always allow that machine to connect
> to the database(s) to do backup, but my fear is that by allowing
> anything other than pgpool to hit those backend databases they could be
> placed out of sync with each other. Admittedly, a backup process
> shouldn't be updating the database, so this, as I said, isn't really a
> big deal. More of a mild kink really. As long as all access is
> happening through pgpool, they should stay coherent to each other.

Pgpool could be modified to have a "no SELECT replication mode", in
which pgpool runs SELECTs on the master server only. I could do this if
you think it's useful.
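
To make the idea concrete, here is a rough sketch of the routing decision
such a mode would imply. It is written in Python for illustration only and
is not pgpool's actual code; the function name, the parameter names, and
the convention that the first backend in the list is the master are all
hypothetical.

```python
from typing import List


def route_statement(sql: str, backends: List[str],
                    select_on_master_only: bool) -> List[str]:
    """Return the backends a statement would be sent to.

    Hypothetical helper: backends[0] is assumed to be the master.
    """
    words = sql.split(None, 1)
    first_word = words[0].upper() if words else ""
    # In the proposed mode, a SELECT goes to the master only.
    if select_on_master_only and first_word == "SELECT":
        return backends[:1]
    # Everything else is still replicated to every backend.
    return list(backends)


print(route_statement("SELECT 1", ["master", "secondary"], True))
print(route_statement("CREATE TEMP TABLE t (i int)",
                      ["master", "secondary"], True))
```

A plain keyword check like this treats every SELECT as read-only, which
is not always true: SELECT nextval(...) advances a sequence, for example.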

However, the problem is that pg_dump does not only run SELECTs; it also
modifies the database (advancing the OID counter), i.e. it creates
temporary tables. Is this a problem for you?
--
Tatsuo Ishii
