Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

From: "K S, Sandhya (Nokia - IN/Bangalore)" <sandhya(dot)k_s(at)nokia(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, "T, Rasna (Nokia - IN/Bangalore)" <rasna(dot)t(at)nokia(dot)com>, "Itnal, Prakash (Nokia - IN/Bangalore)" <prakash(dot)itnal(at)nokia(dot)com>
Subject: Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core
Date: 2017-07-18 05:27:58
Message-ID: AM5PR0701MB26429388BB76340050939D6FD6A10@AM5PR0701MB2642.eurprd07.prod.outlook.com
Lists: pgsql-bugs pgsql-hackers

Hi Craig,

While testing another scenario, continuous restarts of the postgres server, we got many sh-QUIT cores and, along with them, rm-QUIT cores. The rm cores point to the rm of the archive file, but we could not get a backtrace because the stack is corrupted.

We got the following from gdb:
Core was generated by `rm ./Archive_000000020000000000000118'.

We were also able to capture this process listing:
4518 12490 0.0 0.0 11484 1356 ? Ss 10:59 0:00 postgres: archiver process archiving 000000020000000000000118.00000028.backup
4518 12704 2.0 0.0 7672 2932 ? S 11:00 0:00 \_ sh -c rm ./Archive_*; touch ./Archive_"000000020000000000000118.00000028.backup"; exit 0
4518 12707 0.0 0.0 344 4 ? S 11:00 0:00 \_ rm ./Archive_000000020000000000000118

In the Postgres configuration file, we have this setting:
archive_command = 'rm ./Archive_*; touch ./Archive_"%f"; exit 0'
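For reference, the effect of this command can be reproduced outside PostgreSQL. This is a minimal sketch, assuming a scratch directory and a manual substitution of %f (the server performs that substitution itself); the marker file left by a "previous run" is made up for illustration:

```shell
# Sketch only: run the same command text in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical marker left behind by an earlier archive run.
touch ./Archive_000000020000000000000117

# File name taken from the process listing; it stands in for %f.
f='000000020000000000000118.00000028.backup'
cmd='rm ./Archive_*; touch ./Archive_"%f"; exit 0'

# Substitute %f by hand, then run the result through sh -c.
expanded=$(printf '%s\n' "$cmd" | sed "s/%f/$f/")
sh -c "$expanded"
status=$?

remaining=$(ls)
echo "command: $expanded"
echo "exit status: $status"
echo "files left: $remaining"
```

Because of the trailing `exit 0`, the shell reports success regardless of how the rm or touch step ended, so a failure in either step is not visible in the exit status.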

So the core was generated while this archive command was executing.
You pointed out earlier that the issue might be occurring during the archive command, and all the evidence for this crash points to the same command.
Are there any suggestions on how to recover from this situation, or on ways to debug the issue?

Regards,
Sandhya

From: K S, Sandhya (Nokia - IN/Bangalore)
Sent: Wednesday, July 12, 2017 4:51 PM
To: 'Craig Ringer' <craig(at)2ndquadrant(dot)com>
Cc: pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>; PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>; T, Rasna (Nokia - IN/Bangalore) <rasna(dot)t(at)nokia(dot)com>; Itnal, Prakash (Nokia - IN/Bangalore) <prakash(dot)itnal(at)nokia(dot)com>
Subject: RE: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

Hi Craig,

Here is the backtrace after installing all the missing debuginfo packages.

(gdb) bt
#0 0x000000fff7682f18 in do_lookup_x (undef_name=undef_name@entry=0xfff75cece5 "_Jv_RegisterClasses", new_hash=new_hash@entry=2681263574,
old_hash=old_hash@entry=0xffffa159b8, ref=0xfff75ceac8, result=result@entry=0xffffa159a0, scope=<optimized out>, i=1, version=version@entry=0x0,
flags=flags@entry=1, skip=skip@entry=0x0, type_class=type_class@entry=0, undef_map=undef_map@entry=0xfff76a9478) at dl-lookup.c:444
#1 0x000000fff76839a0 in _dl_lookup_symbol_x (undef_name=0xfff75cece5 "_Jv_RegisterClasses", undef_map=0xfff76a9478, ref=0xffffa15a90,
symbol_scope=0xfff76a9980, version=0x0, type_class=<optimized out>, flags=<optimized out>, skip_map=0x0) at dl-lookup.c:833
#2 0x000000fff7685730 in elf_machine_got_rel (lazy=1, map=0xfff76a9478) at ../sysdeps/mips/dl-machine.h:870
#3 elf_machine_runtime_setup (profile=<optimized out>, lazy=1, l=0xfff76a9478) at ../sysdeps/mips/dl-machine.h:916
#4 _dl_relocate_object (scope=0xfff76a9980, reloc_mode=<optimized out>, consider_profiling=0) at dl-reloc.c:259
#5 0x000000fff767ba10 in dl_main (phdr=<optimized out>, phdr@entry=0x120000040, phnum=<optimized out>, phnum@entry=8,
user_entry=user_entry@entry=0xffffa15cf0, auxv=<optimized out>) at rtld.c:2070
#6 0x000000fff7692e3c in _dl_sysdep_start (start_argptr=<optimized out>, dl_main=0xfff7679a98 <dl_main>) at ../elf/dl-sysdep.c:249
#7 0x000000fff767d0d8 in _dl_start_final (arg=arg@entry=0xffffa16410, info=info@entry=0xffffa15d80) at rtld.c:307
#8 0x000000fff767d3d8 in _dl_start (arg=0xffffa16410) at rtld.c:415
#9 0x000000fff7679380 in __start () from /lib64/ld.so.1

Please see if this could help in analysing the issue.

Regards,
Sandhya

From: Craig Ringer [mailto:craig(at)2ndquadrant(dot)com]
Sent: Friday, July 07, 2017 1:53 PM
To: K S, Sandhya (Nokia - IN/Bangalore) <sandhya(dot)k_s(at)nokia(dot)com>
Cc: pgsql-bugs <pgsql-bugs(at)postgresql(dot)org>; PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>; T, Rasna (Nokia - IN/Bangalore) <rasna(dot)t(at)nokia(dot)com>; Itnal, Prakash (Nokia - IN/Bangalore) <prakash(dot)itnal(at)nokia(dot)com>
Subject: Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

On 7 July 2017 at 15:41, K S, Sandhya (Nokia - IN/Bangalore) <sandhya(dot)k_s(at)nokia(dot)com> wrote:
Hi Craig,

The scenario is locking and unlocking the system 30 times. During this scenario, 5 sh-QUIT cores were generated. GDB on the 5 cores points to different locations.
I have attached the output for 2 such instances.

You seem to be missing debug symbols. Install appropriate debuginfo packages.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
