BUG #5288: Restoring a 7.4.5 -Fc dump using -j 2 segfaults (patch included)

From: "Jon Erdman (aka StuckMojo)" <postgresql(at)thewickedtribe(dot)net>
To: pgsql-bugs(at)postgresql(dot)org
Subject: BUG #5288: Restoring a 7.4.5 -Fc dump using -j 2 segfaults (patch included)
Date: 2010-01-19 05:09:00
Message-ID: 201001190509.o0J5908J065090@wwwmaster.postgresql.org
Lists: pgsql-bugs


The following bug has been logged online:

Bug reference: 5288
Logged by: Jon Erdman (aka StuckMojo)
Email address: postgresql(at)thewickedtribe(dot)net
PostgreSQL version: 8.5devel, 8.4
Operating system: Debian Sid
Description: Restoring a 7.4.5 -Fc dump using -j 2 segfaults (patch
included)
Details:

I still run 7.4.5 for my medical billing app. While playing around with 8.5
at AustinPUG last week, I discovered that restoring one of my 7.4 backups
(produced with the 7.4 pg_dump) into 8.5devel, using the 8.5 pg_restore with
-j 2, segfaults immediately. 8.4 does the same.

I built 8.5 with debugging enabled to get a backtrace and investigate. The
cause is a dependency ID in the archive TOC that is much larger than the
highest dump ID in the TOC. That doesn't seem all that odd, considering the
comment right above the offending block says there can be dependencies on
things that aren't in the archive. Because the code uses the dependency ID
(minus one) as an index into the array of TOC entries by dumpId, it indexes
way off the end of that array.

After confirming that this dependency ID really is in the dumped TOC, I wrote
a patch against today's 8.5devel that checks for dependency IDs greater than
maxDumpId. With the patch applied, the restore proceeds cleanly (and with 2
jobs). This would be my first code contribution to PG, so I hope it gets
accepted! :) Oh, and in 8.4 the modified line number would be 3681.

Here is a gdb session showing what I found, and below that is my proposed
patch:

[jon(at)stuck daily]$ gdb --args /usr/local/stow/pgsql-8.5-dev/bin/pg_restore -p 5435 -d debug -j 2 hef.20100113-000001.dmp
GNU gdb (GDB) 7.0-debian
Copyright (C) 2009 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /usr/local/stow/pgsql-8.5-dev/bin/pg_restore...done.
(gdb) run
Starting program: /usr/local/stow/pgsql-8.5-dev/bin/pg_restore -p 5435 -d debug -j 2 hef.20100113-000001.dmp
[Thread debugging using libthread_db enabled]

Program received signal SIGSEGV, Segmentation fault.
0x08051865 in fix_dependencies (AH=0x8063190) at pg_backup_archiver.c:3736
3736            if (tocsByDumpId[te->dependencies[i] - 1] == NULL)
(gdb) bt
#0  0x08051865 in fix_dependencies (AH=0x8063190) at pg_backup_archiver.c:3736
#1  0x08050aec in restore_toc_entries_parallel (AH=0x8063190) at pg_backup_archiver.c:3097
#2  0x0804b4b2 in RestoreArchive (AHX=0x8063190, ropt=0x80630b8) at pg_backup_archiver.c:366
#3  0x0804ac76 in main (argc=8, argv=0xbffff874) at pg_restore.c:380
(gdb) list
3731     */
3732    for (te = AH->toc->next; te != AH->toc; te = te->next)
3733    {
3734        for (i = 0; i < te->nDeps; i++)
3735        {
3736            if (tocsByDumpId[te->dependencies[i] - 1] == NULL)
3737                te->depCount--;
3738        }
3739    }
3740
(gdb) print *te
$1 = {prev = 0x806a028, next = 0x806a308, catalogId = {tableoid = 0, oid = 1001953},
  dumpId = 311, section = SECTION_PRE_DATA, hadDumper = 0 '\000',
  tag = 0x806a238 "plpgsql", namespace = 0x806a2d8 "public", tablespace = 0x0,
  owner = 0x806a2e8 "", withOids = 1 '\001', desc = 0x806a248 "PROCEDURAL LANGUAGE",
  defn = 0x806a260 "CREATE TRUSTED PROCEDURAL LANGUAGE plpgsql HANDLER plpgsql_call_handler;\n",
  dropStmt = 0x806a2b0 "DROP PROCEDURAL LANGUAGE plpgsql;\n", copyStmt = 0x0,
  dependencies = 0x806a2f8, nDeps = 1, dataDumper = 0, dataDumperArg = 0x0,
  formatData = 0x806a490, par_prev = 0x0, par_next = 0x0, created = 0 '\000',
  depCount = 1, lockDeps = 0x0, nLockDeps = 0}
(gdb) print *te->prev
$2 = {prev = 0x8069a10, next = 0x806a1c8, catalogId = {tableoid = 0, oid = 1001952},
  dumpId = 312, section = SECTION_PRE_DATA, hadDumper = 0 '\000',
  tag = 0x806a0a8 "plpgsql_call_handler()", namespace = 0x806a098 "public",
  tablespace = 0x0, owner = 0x806a1a8 "postgres", withOids = 1 '\001',
  desc = 0x806a0c8 "FUNC PROCEDURAL LANGUAGE",
  defn = 0x806a0e8 "CREATE FUNCTION plpgsql_call_handler() RETURNS language_handler\n AS '$libdir/plpgsql', 'plpgsql_call_handler'\n LANGUAGE c;\n",
  dropStmt = 0x806a170 "DROP FUNCTION public.plpgsql_call_handler();\n",
  copyStmt = 0x0, dependencies = 0x0, nDeps = 0, dataDumper = 0, dataDumperArg = 0x0,
  formatData = 0x806a1b8, par_prev = 0x0, par_next = 0x0, created = 0 '\000',
  depCount = 0, lockDeps = 0x0, nLockDeps = 0}
(gdb) print te->dependencies[i]
$3 = 1001952
(gdb) print AH->maxDumpId
$4 = 583
(gdb) print *tocsByDumpId[582]
$5 = {prev = 0x80c13f0, next = 0x80c1b40, catalogId = {tableoid = 0, oid = 1813687},
  dumpId = 583, section = SECTION_POST_DATA, hadDumper = 0 '\000',
  tag = 0x80c17c0 "providernumber_vw_upd", namespace = 0x80c1b10 "public",
  tablespace = 0x0, owner = 0x80c1b20 "postgres", withOids = 1 '\001',
  desc = 0x80c17b0 "RULE",
  defn = 0x80c17e0 "CREATE RULE providernumber_vw_upd AS ON UPDATE TO providernumber_vw DO INSTEAD UPDATE tblprovidernumber SET idoctorid = new.idoctorid, gofficeid = new.gofficeid, iuserid = new.iuserid, iinsurancecoid "..., dropStmt = 0x80c1b00 "",
  copyStmt = 0x0, dependencies = 0x0, nDeps = 0, dataDumper = 0, dataDumperArg = 0x0,
  formatData = 0x80c1b30, par_prev = 0x0, par_next = 0x0, created = 0 '\000',
  depCount = 0, lockDeps = 0x0, nLockDeps = 0}
(gdb) print *tocsByDumpId[583]
Cannot access memory at address 0x1569
(gdb)

And my patch:

diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index dda13ce..1b688ef 100644
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
*************** fix_dependencies(ArchiveHandle *AH)
*** 3733,3740 ****
      {
          for (i = 0; i < te->nDeps; i++)
          {
!             if (tocsByDumpId[te->dependencies[i] - 1] == NULL)
                  te->depCount--;
          }
      }

--- 3733,3743 ----
      {
          for (i = 0; i < te->nDeps; i++)
          {
!             if (te->dependencies[i] > AH->maxDumpId ||
!                 tocsByDumpId[te->dependencies[i] - 1] == NULL)
!             {
                  te->depCount--;
+             }
          }
      }
