Re: [PATCHES] Writing WAL for relcache invalidation:pg_internal.init

From: Jerry Sievers <jerry(at)jerrysievers(dot)com>
To: "Simon Riggs" <simon(at)2ndquadrant(dot)com>
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] Writing WAL for relcache invalidation:pg_internal.init
Date: 2006-11-02 14:36:10
Message-ID: m3ejsldig5.fsf@homie.jerrysievers.com
Lists: pgsql-hackers pgsql-patches

Tom, Simon, et al.: please clarify.

PostgreSQL 8.1.5 on sparc-sun-solaris2.9, compiled by GCC gcc (GCC) 3.4.2

We're getting ready to init a new warm standby instance based on last
night's snapshot of the running prod server. I see a few of these
pg_internal.init files in the cluster as it's being unpacked.

The same warm standby instance may sit for weeks gobbling up WALs from
the prod server before finally being brought live.

Question:

Is it safe to delete the .init files now (before starting recovery),
or perhaps unconditionally right after going live?

In other words, is there any surefire preventative measure that I can
apply to guard against the fault that started this thread?

Tom wrote:
> Meanwhile, if you're trying to recover from a PITR backup and it's not
> working, try removing any pg_internal.init files you can find.

The comment above seems to suggest not touching existing
pg_internal.init files unless a problem is seen.
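
For reference, a minimal sketch of the cleanup Tom describes, assuming
the server is stopped and the unpacked cluster sits under a directory
passed in as data_dir (a hypothetical parameter name, as are the two
function names). It only locates files named pg_internal.init and
deletes them when explicitly asked; it does not answer whether removing
them pre-emptively is safe, which is the open question here.

```python
from pathlib import Path
from typing import List


def find_init_files(data_dir: str) -> List[Path]:
    """Return every pg_internal.init file found anywhere below data_dir."""
    return sorted(Path(data_dir).rglob("pg_internal.init"))


def remove_init_files(data_dir: str, dry_run: bool = True) -> List[Path]:
    """Report the init files under data_dir; delete them unless dry_run is True.

    Assumption: the postmaster is NOT running against data_dir.
    """
    found = find_init_files(data_dir)
    for path in found:
        if not dry_run:
            path.unlink()  # removes only the cache file itself
    return found
```

Calling remove_init_files(data_dir) with the default dry_run=True only
lists what would be removed, which seems the safer first step on a
freshly unpacked snapshot.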

Thanks

"Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:

> On Wed, 2006-11-01 at 12:05 -0500, Tom Lane wrote:
>
> > "Simon Riggs" <simon(at)2ndquadrant(dot)com> writes:
> > > Enclose a patch for new WAL records for relcache invalidation.
> >
> > I don't think this works. RelationCacheInitFileInvalidate is executed
> > post-commit, which means that there's a window between commit and where
> > you propose to write the WAL entry. A crash and restart in that
> > interval would leave the catalog changes committed, but not reflected
> > into pg_internal.init.
>
> Surely you are pointing out a bug, no?
>
> If a backend did crash, the init file would be wrong and we'd get
> exactly the same wrong relfilenode errors we got after that PITR.
>
> The issue must surely be that the patch isn't wrong per se, just that
> RelationCacheInitFileInvalidate is called too late and that requires an
> additional fix. Are we certain that a crash between commit and
> invalidation will cause a PANIC that takes down the server? Doesn't look
> like it's in a critical section to me.
>
> > I think we're probably better off to just forcibly remove the init file
> > during post-recovery cleanup. The easiest place to do this might be
> > BuildFlatFiles, which has to scan pg_database anyway ...
>
> I can do this - I don't have a problem there, but the above issue just
> occurred to me so I wonder now if it's the right thing to do.
>
> PITR will be always-safe but normal operation might not be.
>
> --
> Simon Riggs
> EnterpriseDB http://www.enterprisedb.com

--
-------------------------------------------------------------------------------
Jerry Sievers 305 854-3001 (home) WWW ECommerce Consultant
305 321-1144 (mobile) http://www.JerrySievers.com/
