Re: Cluster seems broken after pg_basebackup

From: Guillaume Lelarge <guillaume(at)lelarge(dot)info>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: Guillaume Drolet <droletguillaume(at)gmail(dot)com>, PostgreSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Cluster seems broken after pg_basebackup
Date: 2015-02-07 06:24:43
Message-ID: CAECtzeVv7wkFwJ3Avjfnvh-Y43mJ5S9H3M0yd97d8h15=GRafA@mail.gmail.com
Lists: pgsql-general

On Feb 6, 2015 17:31, "Adrian Klaver" <adrian(dot)klaver(at)aklaver(dot)com> wrote:
>
> On 02/06/2015 05:03 AM, Guillaume Drolet wrote:
>>
>> Hi,
>>
>> Yesterday I ran a pg_basebackup of my cluster. Since it has completed,
>> my cluster doesn't work properly. I tried restarting the computer (or
>> service) a few times but I always get the same messages in my logs (it's
>> in French. If someone is willing to help me I can try to translate the
>> logs. Just ask):
>
>
> Enter Google Translate:)
>

But first, Guillaume, do yourself and everyone else a favor: turn the damn
log into English. Set lc_messages to 'C' in postgresql.conf.

> First some questions:
>
> 1) What Postgres version?
>
> 2) What OS(s)? I am assuming Windows from the log info below, but we all
> know what assuming gets you.
>
> 3) Where were you backing up from and to?
>
> 4) Which cluster does not start, the master or the child you created with
> pg_basebackup?
>
>
>>
>> 2015-02-06 07:11:38 EST LOG: le système de bases de données a été
>> interrompu ; dernier lancement connu à 2015-02-06 07:05:05 EST
>> 2015-02-06 07:11:38 EST LOG: le système de bases de données n'a pas été
>> arrêté proprement ; restauration
>> automatique en cours
>> 2015-02-06 07:11:38 EST LOG: record with zero length at 24B/2C000160
>> 2015-02-06 07:11:38 EST LOG: la ré-exécution n'est pas nécessaire
>> 2015-02-06 07:11:38 EST LOG: le système de bases de données est prêt
>> pour accepter les connexions
>> 2015-02-06 07:11:38 EST LOG: lancement du processus autovacuum
>> 2015-02-06 07:11:38 EST FATAL: le rôle « 208375PT$ » n'existe pas
>
>
> So where is role 208375PT$ supposed to come from?
>
>
>>
>> Then if I start pgAdmin I get a series of pop-ups that I have to click
>> OK on to continue:
>>
>> An error has occurred: Column not found in pgSet: "datlastsysoid"
>> An error has occurred: Column not found in pgSet: datlastsysoid
>> An error has occurred: Column not found in pgSet: oid
>> An error has occurred: Column not found in pgSet: encoding
>> An error has occurred: Column not found in pgSet: Connection to database
>> broken
>
>
> Not sure about this; someone more versed in pgAdmin will have to
> answer.
>

Usually you see these messages when you're using a pgAdmin major release
older than the PostgreSQL major release. For a 9.3 server, that would mean a
pgAdmin older than 1.18.
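
If you want to check that rule of thumb mechanically, here is a rough Python
sketch. The function names and the table are mine, not anything pgAdmin
ships. Only the 9.3 -> 1.18 pair comes from what I said above; the other
pairs are from memory and should be checked against the pgAdmin release
notes before you rely on them.

```python
# Assumed minimum pgAdmin III release per PostgreSQL major version.
# Only (9, 3) -> (1, 18) is taken from this thread; the rest are from memory.
MIN_PGADMIN = {
    (9, 1): (1, 14),
    (9, 2): (1, 16),
    (9, 3): (1, 18),
    (9, 4): (1, 20),
}


def major(version):
    """Return the first two numeric components, e.g. '9.3.5' -> (9, 3)."""
    return tuple(int(part) for part in version.split(".")[:2])


def pgadmin_too_old(server_version, pgadmin_version):
    """True if pgAdmin predates the release assumed to support the server."""
    required = MIN_PGADMIN.get(major(server_version))
    if required is None:
        raise ValueError("no entry for server version " + server_version)
    return major(pgadmin_version) < required


if __name__ == "__main__":
    print(pgadmin_too_old("9.3.5", "1.16.1"))
    print(pgadmin_too_old("9.3.5", "1.18.1"))
```

The server version is what SELECT version(); reports, and the pgAdmin
version is in its Help > About dialog.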

>
>>
>> And after that, I went back to the log file and there's new information
>> added:
>>
>> 2015-02-06 07:51:05 EST LOG: processus serveur (PID 184) a été arrêté
>> par l'exception 0x80000004
>> 2015-02-06 07:51:05 EST DÉTAIL: Le processus qui a échoué exécutait :
>> SELECT version();
>> 2015-02-06 07:51:05 EST ASTUCE : Voir le fichier d'en-tête C «
>> ntstatus.h » pour une description de la valeur
>> hexadécimale.
>
>
> Well according to here:
>
> https://msdn.microsoft.com/en-us/library/cc704588.aspx
>
> 0x80000004
> STATUS_SINGLE_STEP
>
>
> {EXCEPTION} Single Step A single step or trace operation has just been
> completed.
>
> A developer is going to have to explain what that means.
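
Not an explanation, but the value can at least be pulled apart mechanically.
A minimal Python sketch, assuming the usual NTSTATUS bit layout (2 severity
bits, a customer flag, a reserved bit, a 12-bit facility and a 16-bit code);
decode_ntstatus is a name made up for this example.

```python
SEVERITIES = ["success", "informational", "warning", "error"]


def decode_ntstatus(value):
    """Split a 32-bit NTSTATUS value into its bit fields."""
    return {
        "severity": SEVERITIES[(value >> 30) & 0x3],  # bits 31-30
        "customer": bool((value >> 29) & 0x1),        # bit 29
        "facility": (value >> 16) & 0xFFF,            # bits 27-16
        "code": value & 0xFFFF,                       # bits 15-0
    }


if __name__ == "__main__":
    print(decode_ntstatus(0x80000004))
```

For 0x80000004 this gives warning severity, facility 0, code 4. So the value
is in the warning range rather than the error range, which still doesn't
tell us why the backend went down.
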
>
>
>
>> 2015-02-06 07:51:05 EST LOG: arrêt des autres processus serveur actifs
>> 2015-02-06 07:51:05 EST ATTENTION: arrêt de la connexion à cause de
>> l'arrêt brutal d'un autre processus serveur
>> 2015-02-06 07:51:05 EST DÉTAIL: Le postmaster a commandé à ce processus
>> serveur d'annuler la transaction
>> courante et de quitter car un autre processus serveur a quitté
>> anormalement
>> et qu'il existe probablement de la mémoire partagée corrompue.
>> 2015-02-06 07:51:05 EST ASTUCE : Dans un moment, vous devriez être
>> capable de vous reconnecter à la base de
>> données et de relancer votre commande.
>> 2015-02-06 07:51:05 EST LOG: processus d'archivage (PID 692) quitte
>> avec le code de sortie 1
>> 2015-02-06 07:51:05 EST LOG: tous les processus serveur se sont
>> arrêtés, réinitialisation
>> 2015-02-06 07:51:15 EST FATAL: le bloc de mémoire partagé pré-existant
>> est toujours en cours d'utilisation
>> 2015-02-06 07:51:15 EST ASTUCE : Vérifier s'il n'y a pas de vieux
>> processus serveur en cours d'exécution. Si c'est le
>> cas, fermez-les.
>>
>> I was about to try restarting postgresql using the base backup I made
>> yesterday but since this means I'll have to copy my database again (700
>> GB takes a while...) I am looking for a better solution from more
>> experienced people.
>
>
>
> My suspicion is you copied at least partly over a running server.
>
