Re: standby recovery fails (tablespace related) (tentative patch and discussion)

From: Paul Guo <pguo(at)pivotal(dot)io>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Asim R P <apraveen(at)pivotal(dot)io>, Alexandra Wang <leiwang(at)pivotal(dot)io>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: standby recovery fails (tablespace related) (tentative patch and discussion)
Date: 2020-01-13 10:27:16
Message-ID: CAEET0ZGpfnTdRN4GCKPPPsFK03VnqiyGvyRPW+cY5STbdvyB0w@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 10, 2020 at 9:43 PM Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

> On 2020-Jan-09, Alvaro Herrera wrote:
>
> > I looked at this a little while and was bothered by the perl changes; it
> > seems out of place to have RecursiveCopy be thinking about tablespaces,
> > which is way out of its league. So I rewrote that to use a callback:
> > the PostgresNode code passes a callback that's in charge to handle the
> > case of a symlink. Things look much more in place with that. I didn't
> > verify that all places that should use this are filled.
> >
> > In 0002 I found adding a new function unnecessary: we can keep backwards
> > compat by checking 'ref' of the third argument. With that we don't have
> > to add a new function. (POD changes pending.)
>
> I forgot to add that something in these changes is broken (probably the
> symlink handling callback) so the tests fail, but I couldn't stay away
> from my daughter's birthday long enough to figure out what or how. I'm
> on something else today, so if one of you can research and submit fixed
> versions, that'd be great.
>
> Thanks,
>

I spent some time on this before getting off work today.

With the fix below, the 4th test now passes, but the 5th (the last one)
hangs because of a PANIC during recovery. Backtrace:

(gdb) bt
#0 0x0000003397e32625 in raise () from /lib64/libc.so.6
#1 0x0000003397e33e05 in abort () from /lib64/libc.so.6
#2 0x0000000000a90506 in errfinish (dummy=0) at elog.c:590
#3 0x0000000000a92b4b in elog_finish (elevel=22, fmt=0xb2d580 "cannot find directory %s tablespace %d database %d") at elog.c:1465
#4 0x000000000057aa0a in XLogLogMissingDir (spcNode=16384, dbNode=0, path=0x1885100 "pg_tblspc/16384/PG_13_202001091/16389") at xlogutils.c:104
#5 0x000000000065e92e in dbase_redo (record=0x1841568) at dbcommands.c:2225
#6 0x000000000056ac94 in StartupXLOG () at xlog.c:7200

diff --git a/src/include/commands/dbcommands.h b/src/include/commands/dbcommands.h
index b71b400e700..f8f6d5ffd03 100644
--- a/src/include/commands/dbcommands.h
+++ b/src/include/commands/dbcommands.h
@@ -19,8 +19,6 @@
#include "lib/stringinfo.h"
#include "nodes/parsenodes.h"

-extern void CheckMissingDirs4DbaseRedo(void);
-
extern Oid createdb(ParseState *pstate, const CreatedbStmt *stmt);
extern void dropdb(const char *dbname, bool missing_ok, bool force);
extern void DropDatabase(ParseState *pstate, DropdbStmt *stmt);
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index e6e7ea505d9..4eef8bb1985 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -615,11 +615,11 @@ sub _srcsymlink
my $srcrealdir = readlink($srcpath);

opendir(my $dh, $srcrealdir);
- while (readdir $dh)
+ while (my $entry = (readdir $dh))
{
- next if (/^\.\.?$/);
- my $spath = "$srcrealdir/$_";
- my $dpath = "$dstrealdir/$_";
+ next if ($entry eq '.' or $entry eq '..');
+ my $spath = "$srcrealdir/$entry";
+ my $dpath = "$dstrealdir/$entry";
RecursiveCopy::copypath($spath, $dpath);
}
closedir $dh;
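
The PostgresNode.pm hunk above replaces the implicit $_ in the readdir loop with a named lexical. The message does not say why this makes the 4th test pass. One possible factor is that, according to perlfunc, a bare readdir in a while condition only sets $_ as of Perl 5.12, so the old loop would not work on an older Perl. The standalone sketch below shows the same loop shape outside the test framework. The helper name list_entries and the scratch files are hypothetical and not part of the patch, and the explicit defined() is a conservative choice made for this sketch only.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper for illustration only; it is not part of the patch.
# Returns the sorted names found in $dir, excluding "." and "..".
sub list_entries
{
	my ($dir) = @_;
	my @entries;

	opendir(my $dh, $dir) or die "could not open directory \"$dir\": $!";

	# Assign each name to a lexical instead of relying on $_.  The
	# explicit defined() keeps the loop going for a file named "0".
	while (defined(my $entry = readdir $dh))
	{
		next if ($entry eq '.' or $entry eq '..');
		push @entries, $entry;
	}
	closedir $dh;

	@entries = sort @entries;
	return @entries;
}

# Usage: populate a scratch directory and list its contents.
my $dir = tempdir(CLEANUP => 1);
foreach my $name ('0', 'a.txt', 'b.txt')
{
	open(my $fh, '>', "$dir/$name") or die "could not create $name: $!";
	close $fh;
}
print join(' ', list_entries($dir)), "\n";
```

Because the loop variable is lexical, code called from inside the loop body cannot change it, whichever Perl version runs the test.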
