Re: db replication and errors

From: Richard Huxton <dev(at)archonet(dot)com>
To: Benjamin <benjamin(at)netyantra(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: db replication and errors
Date: 2005-02-18 16:31:53
Message-ID: 42161879.2060005@archonet.com
Lists: pgsql-general

Benjamin wrote:
> Thanks for the pointers, Richard.
>
>>> If the async feature is used on the primary, should we
>>> copy the xlog and clog files onto the backup as
>>> well?
>>
>> What is the "async feature"?
>
> I meant fsync.
> I meant to ask, if FSYNC is enabled, is all pending data written onto
> the disk?

Yes. Turn it on if you want your data to survive a power failure.
Oh, and make sure your disks aren't caching writes and reporting them
as done when you sync.
Search the list archives for "cache" and "IDE" for plenty of discussion.
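
To illustrate what "synced to disk" means at the OS level, here is a
minimal Python sketch. It is not PostgreSQL's own code, and durable_write
is a hypothetical helper name used only for this example. It writes a file
and asks the kernel to push it to the device before returning.

```python
import os


def durable_write(path, data):
    """Write bytes to path and ask the OS to flush them to the device.

    Hypothetical helper for illustration; PostgreSQL does its own syncing
    internally when fsync is enabled.
    """
    with open(path, "wb") as f:
        f.write(data)
        # Push Python's user-space buffer down to the kernel.
        f.flush()
        # Ask the kernel to write its cached pages to the storage device.
        os.fsync(f.fileno())
```

Note that os.fsync() only covers the OS side. If the drive itself caches
writes and reports them as complete, the data can still be lost on a power
failure, which is the point about write-caching disks above.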

>>
>>> As of now, we shut down postmaster on the primary
>>> whenever the standby boots up, and then copy all the above-mentioned files
>>> from the primary to the standby.
>>>
>>> Does this ensure all data is written to disk before
>>> postmaster shuts down?
>>
>> Provided the postmaster shuts down cleanly, and you've synced to disk
>> then all should be OK.
>
>
> What decides this "sync" and how do I check it?
>
>>> Quite a few times, I have encountered errors, like, xlogflush is not
>>> satisfied,
>>> bogus attribute number for <some num , eg. -2>, catalog is missing,
>>> cache lookup failed.
>>
>>
>> One of 4 things could be at fault:
>> 1. Files aren't being sync'ed to disk
>> 2. You aren't copying the right files
>> 3. The versions of PG don't match
>> 4. The platforms you are running on are different (e.g. Sun-Sparc vs
>> x86)
>
>
> The latter two are not the case; I use Red Hat 9 on all the machines,
> with PG version 7.3.
> The former two, yes, I agree, could be the cause of problems.
>
> I would like to know where to look for such errors, e.g. for a cache lookup
> failure: what triggers it? How do I go about tracking down the issue?

A cache lookup failure is usually due to the OID of an object changing,
e.g. where you drop and recreate a temporary table and a function is
still referring to its old OID.
In your case, I'm not sure what's causing the problem. It could be
you've not copied the table definitions over and you've updated your
schema on the original machine.

>> It might be worth looking at "slony" to run a replication setup,
>> rather than copying files.
>
> I did think of slony previously. But slony has the limitation of not being
> able to replicate large objects, right?
> How large are these large objects supposed to be?

Um, large as you like. See the manuals for discussion of large object
support. I'm guessing you're not using it.

> Run-time replication is not an issue, as I have other mechanisms for
> that, which are part of this server, and they work fine.
> The only problem I am facing now, is of the case when the standby is
> booting up. I have to ensure an absolutely correct copying of files.

If you've got a replicated version of the database why bother copying
the files?
Thinking about it, why copy the files at all anyway? If the server is
still running why has PG stopped?

> I want to know how I go about diagnosing problems, if and when they
> arise.
> I have come across pg_filedump, but I can't really make out much from the
> output that pg_filedump produces.

If you have *any* problems, then the file copy didn't work. Bin it and
restore from backup. It's only when you don't have a backup that it's
worth messing with pg_filedump.
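
One rough way to check whether a file copy worked is to compare checksums
of the two directory trees while the postmaster is stopped on both sides.
The sketch below is only an illustration: compare_trees and the other
function names are hypothetical, and it assumes both trees are reachable
as local paths (for example, with the standby's copy mounted or pulled
back).

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(source_root, copy_root):
    """Return (missing, extra, changed) relative paths for the copy."""
    src = tree_digests(source_root)
    dst = tree_digests(copy_root)
    missing = sorted(set(src) - set(dst))   # in source, not in copy
    extra = sorted(set(dst) - set(src))     # in copy, not in source
    changed = sorted(k for k in src.keys() & dst.keys() if src[k] != dst[k])
    return missing, extra, changed
```

If all three lists come back empty, the copy is byte-for-byte identical to
the source as it was read. Anything else means the copy should not be
trusted. A clean result says nothing about whether the source itself was
fully synced before it was copied.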

--
Richard Huxton
Archonet Ltd
