Re: BUG #16147: postgresql 12.1 (from homebrew) - pg_restore -h localhost --jobs=2 crashes

From: David Zhang <david(dot)zhang(at)highgo(dot)ca>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16147: postgresql 12.1 (from homebrew) - pg_restore -h localhost --jobs=2 crashes
Date: 2020-03-06 03:53:35
Message-ID: 2a1b69ea-6a33-cacb-8b3e-32886cb84368@highgo.ca
Lists: pgsql-bugs pgsql-hackers

Hi,

I can reproduce this pg_restore crash (pg_dump also crashes when run
with multiple jobs) on macOS 10.14 Mojave and macOS 10.15 Catalina
using the following steps.

1. build pg_restore from 12.2 with "--with-gssapi" enabled, or use the
release from https://www.postgresql.org/download/macosx/

2. start pg server and generate some load,
    pgbench -i -p 5432 -d postgres -s 10

3. backup database,
    pg_dump -h localhost -Fc --no-acl --no-owner postgres > /tmp/128m

4. drop the tables,
    psql -d postgres -c "drop table pgbench_accounts; drop table pgbench_branches; drop table pgbench_history; drop table pgbench_tellers;"

5. restore database,
    pg_restore -d postgres -h localhost -Fc /tmp/128m --jobs=2
    Password:
    pg_restore: error: a worker process died unexpectedly

6. check the tables; all of them show a size of 0 bytes,
    postgres=# \d+
                              List of relations
     Schema |       Name       | Type  |  Owner   |  Size   | Description
    --------+------------------+-------+----------+---------+-------------
     public | pgbench_accounts | table | postgres | 0 bytes |
     public | pgbench_branches | table | postgres | 0 bytes |
     public | pgbench_history  | table | postgres | 0 bytes |
     public | pgbench_tellers  | table | postgres | 0 bytes |
    (4 rows)

7. core dump, about 2G,
(lldb) bt all
* thread #1, stop reason = signal SIGSTOP
  * frame #0: 0x00007fff6c29c44e
libdispatch.dylib`_dispatch_mgr_queue_push + 41
    frame #1: 0x00007fff41475a74
Security`___ZN8Security12KeychainCore14StorageManager14tickleKeychainEPNS0_12KeychainImplE_block_invoke_2
+ 76
    frame #2: 0x00007fff6c29250e
libdispatch.dylib`_dispatch_client_callout + 8
    frame #3: 0x00007fff6c29e567
libdispatch.dylib`_dispatch_lane_barrier_sync_invoke_and_complete + 60
    frame #4: 0x00007fff41475935
Security`Security::KeychainCore::StorageManager::tickleKeychain(Security::KeychainCore::KeychainImpl*)
+ 485
    frame #5: 0x00007fff412400d8
Security`Security::KeychainCore::KCCursorImpl::next(Security::KeychainCore::Item&)
+ 352
    frame #6: 0x00007fff41417975
Security`Security::KeychainCore::IdentityCursor::next(Security::SecPointer<Security::KeychainCore::Identity>&)
+ 217
    frame #7: 0x00007fff4143c4c3 Security`SecIdentitySearchCopyNext + 155
    frame #8: 0x00007fff414477d8
Security`SecItemCopyMatching_osx(__CFDictionary const*, void const**) + 261
    frame #9: 0x00007fff4144b024 Security`SecItemCopyMatching + 338
    frame #10: 0x00007fff56dab303 Heimdal`keychain_query + 531
    frame #11: 0x00007fff56da8f4c Heimdal`hx509_certs_find + 92
    frame #12: 0x00007fff56d67b52 Heimdal`_krb5_pk_find_cert + 466
    frame #13: 0x00007fff376da9bb GSS`_gsspku2u_acquire_cred + 619
    frame #14: 0x00007fff376bfc1c GSS`gss_acquire_cred + 940
    frame #15: 0x000000010016e6e1
libpq.5.dylib`pg_GSS_have_cred_cache(cred_out=0x0000000100505688) at
fe-gssapi-common.c:67:10
    frame #16: 0x000000010014f769
libpq.5.dylib`PQconnectPoll(conn=0x0000000100505310) at fe-connect.c:2785:22
    frame #17: 0x000000010014be9f
libpq.5.dylib`connectDBComplete(conn=0x0000000100505310) at
fe-connect.c:2095:10
    frame #18: 0x000000010014bb0c
libpq.5.dylib`PQconnectdbParams(keywords=0x00007ffeefbfeee0,
values=0x00007ffeefbfeea0, expand_dbname=1) at fe-connect.c:625:10
    frame #19: 0x000000010000ec20
pg_restore`ConnectDatabase(AHX=0x0000000100505070, dbname="postgres",
pghost="david.highgo.ca", pgport=0x0000000000000000, username="david",
prompt_password=TRI_DEFAULT) at pg_backup_db.c:287:20
    frame #20: 0x000000010000a75a
pg_restore`CloneArchive(AH=0x00000001002020f0) at
pg_backup_archiver.c:4850:3
    frame #21: 0x0000000100017b4b
pg_restore`RunWorker(AH=0x00000001002020f0, slot=0x0000000100221718) at
parallel.c:866:7
    frame #22: 0x00000001000179f5
pg_restore`ParallelBackupStart(AH=0x00000001002020f0) at parallel.c:1028:4
    frame #23: 0x0000000100004473
pg_restore`RestoreArchive(AHX=0x00000001002020f0) at
pg_backup_archiver.c:662:12
    frame #24: 0x0000000100001be4 pg_restore`main(argc=10,
argv=0x00007ffeefbff8f0) at pg_restore.c:447:3
    frame #25: 0x00007fff6c2eb7fd libdyld.dylib`start + 1
(lldb)

8. however, it works with either,
    PGGSSENCMODE=disable pg_restore -d postgres -h localhost -Fc /tmp/128m --jobs=2
or,
    pg_restore -d "dbname=postgres gssencmode=disable" -h localhost -Fc /tmp/128m --jobs=2

9. pg_config output and versions, no SSL configured,

$ pg_config
BINDIR = /Users/david/sandbox/pg122/app/bin
DOCDIR = /Users/david/sandbox/pg122/app/share/doc/postgresql
HTMLDIR = /Users/david/sandbox/pg122/app/share/doc/postgresql
INCLUDEDIR = /Users/david/sandbox/pg122/app/include
PKGINCLUDEDIR = /Users/david/sandbox/pg122/app/include/postgresql
INCLUDEDIR-SERVER = /Users/david/sandbox/pg122/app/include/postgresql/server
LIBDIR = /Users/david/sandbox/pg122/app/lib
PKGLIBDIR = /Users/david/sandbox/pg122/app/lib/postgresql
LOCALEDIR = /Users/david/sandbox/pg122/app/share/locale
MANDIR = /Users/david/sandbox/pg122/app/share/man
SHAREDIR = /Users/david/sandbox/pg122/app/share/postgresql
SYSCONFDIR = /Users/david/sandbox/pg122/app/etc/postgresql
PGXS =
/Users/david/sandbox/pg122/app/lib/postgresql/pgxs/src/makefiles/pgxs.mk
CONFIGURE = '--with-gssapi' '--prefix=/Users/david/sandbox/pg122/app'
'--enable-debug' 'CFLAGS=-ggdb -O0 -fno-omit-frame-pointer'
CC = gcc
CPPFLAGS = -isysroot
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.15.sdk
CFLAGS = -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Werror=vla -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -Wno-unused-command-line-argument -g -ggdb -O0
-fno-omit-frame-pointer
CFLAGS_SL =
LDFLAGS = -Wl,-dead_strip_dylibs
LDFLAGS_EX =
LDFLAGS_SL =
LIBS = -lpgcommon -lpgport -lgssapi_krb5 -lz -lreadline -lm
VERSION = PostgreSQL 12.2

$ lldb --version
lldb-1100.0.30.12
Apple Swift version 5.1.3 (swiftlang-1100.0.282.1 clang-1100.0.33.15)

$ klist --version
klist (Heimdal 1.5.1apple1)
Copyright 1995-2011 Kungliga Tekniska Högskolan
Send bug-reports to heimdal-bugs(at)h5l(dot)org

Hopefully the above information can help.
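
For convenience, steps 2 to 5 and the first workaround from step 8 could
be scripted roughly as follows. This is only a sketch, not something I
ran as-is: the run/prepare/main helpers and the DB, DUMP and DRY_RUN
variables are names made up for illustration, "pg_dump -f" is used in
place of the shell redirect from step 3, and it assumes a local server
on port 5432. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of steps 2-5 (crash) and step 8 (workaround) from this report.
# DB, DUMP, DRY_RUN and the helper names are illustrative assumptions.
DB=${DB:-postgres}
DUMP=${DUMP:-/tmp/128m}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode (display only, arguments are not
# re-quoted), otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# Steps 2-4: generate load, dump it, then drop the tables.
prepare() {
    run pgbench -i -p 5432 -d "$DB" -s 10
    # -f replaces the "> /tmp/128m" redirect used in step 3
    run pg_dump -h localhost -Fc --no-acl --no-owner -f "$DUMP" "$DB"
    run psql -d "$DB" -c "drop table pgbench_accounts; drop table pgbench_branches; drop table pgbench_history; drop table pgbench_tellers;"
}

main() {
    case "${1:-crash}" in
        crash)
            # Step 5: parallel restore, the case that dies in this report.
            prepare
            run pg_restore -d "$DB" -h localhost -Fc "$DUMP" --jobs=2
            ;;
        workaround)
            # Step 8: same restore with GSS encryption disabled.
            prepare
            run env PGGSSENCMODE=disable pg_restore -d "$DB" -h localhost -Fc "$DUMP" --jobs=2
            ;;
        *)
            echo "usage: $0 [crash|workaround]" >&2
            return 1
            ;;
    esac
}

main "$@"
```

Saved under a hypothetical name such as repro.sh, "DRY_RUN=0 sh repro.sh crash"
would run the failing case and "DRY_RUN=0 sh repro.sh workaround" the
one with gssencmode disabled; pg_restore may still prompt for a password
as in step 5.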

On 2019-12-04 6:03 a.m., Tom Lane wrote:
> PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
>> The following bug has been logged on the website:
>> Bug reference: 16147
>> Logged by: Bill Tihen
>> Email address: btihen(at)gmail(dot)com
>> PostgreSQL version: 12.1
>> Operating system: MacOS 10.15.1
>> Description:
>> The following command crashes with any database I've tried (both large and
>> small) DBs:
>> `pg_restore -U wti0405 -d stage3 -h localhost --jobs=8 -Fc
>> database_12_04-01-00.bak -x`
> I failed to reproduce this on my own 10.15.1 laptop, using manual
> builds of either HEAD or the v12 branch. Plausible reasons for
> the difference in results might include:
>
> * There's something different about the homebrew build (could we
> see the output of pg_config?)
>
> * There's something unusual about your configuration (one thought
> that comes to mind: do you have SSL turned on for localhost
> connections?)
>
> * There's something about the data in this specific database
> (your report that it happens for multiple databases puts a crimp
> in this idea, though maybe they all share a common feature)
>
> Anyway, we need more info to investigate. You might try looking
> into the server log to see what the failure looks like from that
> side --- is there a query error, or just the worker disconnecting
> unexpectedly?
>
> regards, tom lane
>
>
>
>
--
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca
